Abstract

An important question in the study of constraint satisfaction problems (CSP) is understanding how the graph 
or hypergraph describing the incidence structure of the constraints influences the complexity of the problem. For 
binary CSP instances (i.e., where each constraint involves only two variables), the situation is well understood: the 
complexity of the problem essentially depends on the treewidth of the graph of the constraints [27, 42]. However,
this is not the correct answer if constraints with unbounded number of variables are allowed, and in particular, for 
CSP instances arising from query evaluation problems in database theory. Formally, if $\mathcal{H}$ is a class of hypergraphs,
then let CSP($\mathcal{H}$) be CSP restricted to instances whose hypergraph is in $\mathcal{H}$. Our goal is to characterize those classes
of hypergraphs for which CSP($\mathcal{H}$) is polynomial-time solvable or fixed-parameter tractable, parameterized by the
number of variables. Note that in the applications related to database query evaluation, we usually assume that the 
number of variables is much smaller than the size of the instance, thus parameterization by the number of variables 
is a meaningful question. 

The most general known property of $\mathcal{H}$ that makes CSP($\mathcal{H}$) polynomial-time solvable is bounded fractional
hypertree width. Here we introduce a new hypergraph measure called submodular width, and show that bounded
submodular width of $\mathcal{H}$ (which is a strictly more general property than bounded fractional hypertree width) implies
that CSP($\mathcal{H}$) is fixed-parameter tractable. In a matching hardness result, we show that if $\mathcal{H}$ has unbounded
submodular width, then CSP($\mathcal{H}$) is not fixed-parameter tractable (and hence not polynomial-time solvable), unless the
Exponential Time Hypothesis (ETH) fails. The algorithmic result uses tree decompositions in a novel way: instead 
of using a single decomposition depending on the hypergraph, the instance is split into a set of instances (all on the 
same set of variables as the original instance), and then the new instances are solved by choosing a different tree 
decomposition for each of them. The reason why this strategy works is that the splitting can be done in such a way 
that the new instances are "uniform" with respect to the number of extensions of partial solutions, and therefore the
number of partial solutions can be described by a submodular function. For the hardness result, we prove via a series 
of combinatorial results that if a hypergraph H has large submodular width, then a 3SAT instance can be efficiently 
simulated by a CSP instance whose hypergraph is H. To prove these combinatorial results, we need to develop a 
theory of (multicommodity) flows on hypergraphs and vertex separators in the case when the function b(S) defining 
the cost of separator S is submodular, which can be of independent interest. 
1 Introduction



There is a long line of research devoted to identifying hypergraph properties that make the evaluation of conjunctive 
queries tractable (see e.g. [23, 50, 26, 27]). Our main contribution is giving a complete theoretical answer to this
question: in a very precise technical sense, we characterize those hypergraph properties that imply tractability for the 
evaluation of a query. Efficient evaluation of queries is originally a question of database theory; however, it has been 
noted that the problem can be treated as a constraint satisfaction problem (CSP) and this connection led to a fruitful 
interaction between the two communities [39, 25, 30]. Most of the literature relevant to the current paper uses the
language of constraint satisfaction. Therefore, after a brief explanation of the database-theoretic motivation, we switch 
to the language of CSPs. 

Conjunctive queries. Evaluation of conjunctive queries (or equivalently, Select-Project-Join queries) is one of 
the most basic and most studied tasks in relational databases. A relational database consists of a fixed set of relations.
A conjunctive query defines a new relation that can be obtained by first taking the join of some relations and
then projecting it to a subset of the variables. As an example, consider a relational database that contains three relations:
enrolled(Person, Course, Date), teaches(Person, Course, Year), parent(Person1, Person2). The following query
Q defines a relation ans(P) with the meaning that "P is enrolled in a course taught by her parent."

$Q : \mathit{ans}(P) \leftarrow \mathit{enrolled}(P,C,D) \wedge \mathit{teaches}(P_2,C,Y) \wedge \mathit{parent}(P_2,P).$

In the Boolean Conjunctive Query problem we need only to decide if the answer relation is empty or not, that is, if the 
join of the relations is empty or not. This is usually denoted as the relation "ans" not having any variables. Boolean 
Conjunctive Query contains most of the combinatorial difficulty of the general problem without complications such
as the size of the output being exponentially large. Therefore, the current paper focuses on this decision problem.
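To make the difference between the general problem and its Boolean version concrete, the following is a minimal sketch of naive query evaluation. The function name `evaluate_query`, the data layout, and the sample tuples are assumptions made for this illustration only; it tries every assignment over the active domain and is not meant as an efficient method.

```python
from itertools import product


def evaluate_query(atoms, relations, head):
    """Naively evaluate a conjunctive query by trying every assignment.

    atoms     -- list of (relation_name, tuple_of_variables)
    relations -- dict mapping relation_name to a set of tuples
    head      -- tuple of answer variables (empty for a Boolean query)
    """
    variables = sorted({v for _, scope in atoms for v in scope})
    # Active domain: every value that occurs in some relation.
    domain = sorted({x for rel in relations.values() for t in rel for x in t})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[v] for v in scope) in relations[name]
               for name, scope in atoms):
            answers.add(tuple(assignment[v] for v in head))
    return answers


# Hypothetical sample database for the query Q of the text.
RELATIONS = {
    "enrolled": {("ann", "db", "2009-09-01")},
    "teaches": {("bob", "db", "2009")},
    "parent": {("bob", "ann")},
}
ATOMS = [
    ("enrolled", ("P", "C", "D")),
    ("teaches", ("P2", "C", "Y")),
    ("parent", ("P2", "P")),
]

answer_relation = evaluate_query(ATOMS, RELATIONS, ("P",))
boolean_answer = bool(evaluate_query(ATOMS, RELATIONS, ()))
```

With an empty head, the answer relation is either empty or contains only the empty tuple, which is exactly the yes/no question asked by Boolean Conjunctive Query.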

In a natural way, we can define the hypergraph of a query: its vertices are the variables appearing in the query 
and for each relation there is a corresponding edge containing the variables appearing in the relation. Intuitively, if the 
hypergraph has "simple structure," then the query is easy to solve. For example, compare the following two queries: 

$Q_1 : \mathit{ans} \leftarrow R_1(A,B,C) \wedge R_2(C,D) \wedge R_3(D,E,F) \wedge R_4(E,F,G,H) \wedge R_5(H,I)$
$Q_2 : \mathit{ans} \leftarrow R_1(A,B) \wedge R_2(A,C) \wedge R_3(A,D) \wedge R_4(B,C) \wedge R_5(B,D) \wedge R_6(C,D)$

Even though more variables appear in $Q_1$, evaluating it seems to be easier: its hypergraph is "path like," thus the query
can be answered efficiently by, say, dynamic programming techniques. On the other hand, the hypergraph of $Q_2$ is a
clique on 4 vertices and no significant shortcut is apparent compared to trying all possible combinations of values for 
(A,B,C,D). 

What are those hypergraph properties that make Boolean Conjunctive Query tractable? In the early 1980s, it was
noted that acyclicity is one such property ||9j [I9j EJJ HI. Later, more general such properties were identified in the 
literature: for example, bounded query width [14], bounded hypertree width [25], and bounded fractional hypertree
width [41, 28]. Our goal is to find the most general hypergraph property that guarantees an efficient solution for query
evaluation. 

Constraint satisfaction. Constraint satisfaction is a general framework that includes many standard algorithmic 
problems such as satisfiability, graph coloring, database queries, etc. [25, 20]. A constraint satisfaction problem (CSP)
consists of a set V of variables, a domain D, and a set C of constraints, where each constraint is a relation on a subset of 
the variables. The task is to assign a value from D to each variable in such a way that every constraint is satisfied (see 
Definition 2.1 for the formal definition). For example, 3SAT can be interpreted as a CSP where the domain
is D = {0, 1} and the constraints in C correspond to the clauses (thus the arity of each constraint is 3). As another
example, let us observe that the k-Clique problem (Is there a k-clique in a given graph G?) can be easily expressed as a
CSP instance. Let D be the set of vertices of G, let V contain k variables, and let C contain $\binom{k}{2}$ constraints, one constraint
on each pair of variables. The binary relations of these constraints require that the two vertices are adjacent. Therefore,
the CSP instance has a solution if and only if G has a k-clique.
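The k-Clique encoding described above can be sketched as follows. This is an illustrative sketch only: the names `clique_to_csp` and `has_solution` are hypothetical, the graph is assumed to be simple (no self-loops, so adjacent vertices are distinct), and the solver is plain brute force over all assignments.

```python
from itertools import combinations, product


def clique_to_csp(vertices, edges, k):
    """Build the CSP instance (V, D, C) encoding 'does G contain a k-clique?'."""
    variables = [f"x{i}" for i in range(k)]
    domain = list(vertices)
    # Symmetric adjacency relation of the (assumed loop-free) graph.
    adjacent = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    # One binary constraint on each of the C(k, 2) pairs of variables.
    constraints = [((a, b), adjacent) for a, b in combinations(variables, 2)]
    return variables, domain, constraints


def has_solution(variables, domain, constraints):
    """Brute force: try all |D|^|V| assignments."""
    for values in product(domain, repeat=len(variables)):
        f = dict(zip(variables, values))
        if all(tuple(f[v] for v in scope) in rel for scope, rel in constraints):
            return True
    return False


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
GRAPH_VERTICES = [1, 2, 3, 4]
GRAPH_EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]

has_triangle = has_solution(*clique_to_csp(GRAPH_VERTICES, GRAPH_EDGES, 3))
has_k4 = has_solution(*clique_to_csp(GRAPH_VERTICES, GRAPH_EDGES, 4))
```

Note that the number of variables is k while the domain has one value per vertex of G, which matches the "few variables, large domain" regime that the paper focuses on.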

It is easy to see that Boolean Conjunctive Query can be formulated as the problem of deciding if a CSP instance 
has a solution: the variables of the CSP instance correspond to the variables appearing in the query and the constraints 
correspond to the database relations. A distinctive feature of CSP instances obtained this way is that the number of 
variables is small (as queries are typically small), while the domain of the variables is large (as the database relations
usually contain a large number of entries). This has to be contrasted with typical CSP problems from AI, such as 
3-colorability and satisfiability, where the domain is small, but the number of variables is large. As our motivation is 
database-theoretic, in the rest of the paper the reader should keep in mind that we are envisioning scenarios where the 
number of variables is small and the domain is large. 

As the examples above show, solving constraint satisfaction problems is NP-hard in general if there are no addi- 
tional restrictions on the instances. The main goal of the research on CSP is to identify tractable special cases of the 
general problem. The theoretical literature on CSP investigates two main types of restrictions. The first type is to 
restrict the constraint language, that is, the type of constraints that are allowed. This direction includes the classical 
work of Schaefer lISTI and its many generalizations ifTUl [TT1 [T2l 1201 1381 . The second type is to restrict the structure 
induced by the constraints on the variables. The hypergraph of a CSP instance is defined to be a hypergraph on the 
variables of the instance such that for each constraint $c \in C$ there is a hyperedge $e_c$ containing exactly the variables
that appear in $c$. If the hypergraph of the CSP instance has very simple structure, then the instance is easy to solve.
For example, it is well-known that a CSP instance $I$ with hypergraph $H$ can be solved in time $\|I\|^{O(\mathrm{tw}(H))}$ [22], where
$\mathrm{tw}(H)$ denotes the treewidth of $H$ and $\|I\|$ is the size of the representation of $I$ in the input.

Our goal is to characterize the "easy" and "hard" hypergraphs from the viewpoint of constraint satisfaction. However,
formally speaking, CSP is polynomial-time solvable for every fixed hypergraph $H$: since $H$ has a constant number
$k$ of vertices, every CSP instance $I$ with hypergraph $H$ can be solved by trying all $\|I\|^k$ possible combinations on the $k$
variables. It makes more sense to characterize those classes of hypergraphs where CSP is easy. Formally, for a class
$\mathcal{H}$ of hypergraphs, let CSP($\mathcal{H}$) be the restriction of CSP where the hypergraph of the instance is assumed to be in $\mathcal{H}$.
For example, as discussed above, we know that if $\mathcal{H}$ is a class of hypergraphs with bounded treewidth (i.e., there is a
constant $w$ such that $\mathrm{tw}(H) \le w$ for $H \in \mathcal{H}$), then CSP($\mathcal{H}$) is polynomial-time solvable.

For the characterization of the complexity of CSP($\mathcal{H}$), we can investigate two notions of tractability. CSP($\mathcal{H}$)
is polynomial-time solvable if there is an algorithm solving every instance $I$ of CSP($\mathcal{H}$) in time $\|I\|^{O(1)}$, where $\|I\|$
is the length of the representation of $I$ in the input. The following notion interprets tractability in a less restrictive
way: CSP($\mathcal{H}$) is fixed-parameter tractable (FPT) if there is an algorithm solving every instance $I$ of CSP($\mathcal{H}$) in time
$f(H)\|I\|^{O(1)}$, where $f$ is an arbitrary function of the hypergraph $H$ of the instance. Equivalently, the factor $f(H)$
in the definition can be replaced by a factor $f(k)$ depending only on the number $k$ of vertices of $H$: as the number
of hypergraphs on $k$ vertices (without parallel edges) is bounded by a function of $k$, the two definitions result in the
same notion. The motivation behind this definition is that if the number of variables is assumed to be much smaller
than the domain size, then we can afford even exponential dependence on the number of variables, as long as the
dependence on the size of the instance is polynomial. For a more general treatment of fixed-parameter tractability, the
reader is referred to the parameterized complexity literature [18, 21, 45].

The case of bounded arities. If the constraints have bounded arity (i.e., the edge size in $\mathcal{H}$ is bounded by a constant
$r$), then the complexity of CSP($\mathcal{H}$) is well understood. In this case, bounded treewidth is the only polynomial-time
solvable case: 

Theorem 1.1 ([27]). If $\mathcal{H}$ is a recursively enumerable class of hypergraphs with bounded edge size, then (assuming
FPT $\neq$ W[1]) the following are equivalent:

1. CSP($\mathcal{H}$) is polynomial-time solvable.

2. CSP($\mathcal{H}$) is fixed-parameter tractable.

3. $\mathcal{H}$ has bounded treewidth.

The assumption FPT $\neq$ W[1] is a standard hypothesis of parameterized complexity. Thus in the bounded arity
case bounded treewidth is the only property of the hypergraph that can make the problem polynomial-time solvable.
By definition, polynomial-time solvability implies fixed-parameter tractability, but Theorem 1.1 proves the surprising
result that whenever CSP($\mathcal{H}$) is fixed-parameter tractable, it is polynomial-time solvable as well.

The following sharpening of Theorem 1.1 shows that there is no algorithm whose running time is significantly
better than the $\|I\|^{O(\mathrm{tw}(H))}$ bound of the treewidth based algorithm, and this is true if we restrict the problem to any
class $\mathcal{H}$ of hypergraphs. The result is proved under the Exponential Time Hypothesis (ETH) 051, a somewhat stronger
assumption than FPT $\neq$ W[1]: it is assumed that there is no $2^{o(n)}$ time algorithm for $n$-variable 3SAT.
Theorem 1.2 ([42]). If there is a function $f$ and a recursively enumerable class $\mathcal{H}$ of hypergraphs with bounded
edge size and unbounded treewidth such that the problem CSP($\mathcal{H}$) can be solved in time $f(H)\|I\|^{o(\mathrm{tw}(H)/\log \mathrm{tw}(H))}$ for
instances $I$ with hypergraph $H \in \mathcal{H}$, then ETH fails.

This means that the treewidth-based algorithm is almost optimal on every class of hypergraphs: in the exponent
only an $O(\log \mathrm{tw}(H))$ factor improvement is possible. It is conjectured in [42] that Theorem 1.2 can be made tight, i.e.,
the lower bound holds even if the logarithmic factor is removed from the exponent.

Conjecture 1.3 ([42]). If $\mathcal{H}$ is a class of hypergraphs with bounded edge size, then there is no algorithm that solves
CSP($\mathcal{H}$) in time $f(H)\|I\|^{o(\mathrm{tw}(H))}$ for instances $I$ with hypergraph $H \in \mathcal{H}$, where $f$ is an arbitrary function.

Unbounded arities. The situation is less understood in the unbounded arity case, i.e., when there is no bound on 
the maximum edge size in $\mathcal{H}$. First, the complexity in the unbounded-arity case depends on how the constraints are
represented. In the bounded-arity case, if each constraint contains at most $r$ variables ($r$ being a fixed constant), then
every reasonable representation of a constraint has size $|D|^{O(r)}$. Therefore, the size of the different representations can
differ only by a polynomial factor. On the other hand, if there is no bound on the arity, then there can be exponential 
difference between the size of succinct representations (e.g., formulas [15]) and verbose representations (e.g., truth 
tables [44]). The running time of an algorithm is expressed as a function of the input size, hence the complexity of the 
problem can depend on how the input is represented: longer representation means that it is potentially easier to obtain 
a polynomial-time algorithm. 

The most well-studied representation of constraints is listing all the tuples that satisfy the constraint. This repre- 
sentation is perfectly compatible with our database-theoretic motivation: the constraints are relations of the database, 
and a relation is physically stored as a table containing all the tuples in the relation. For this representation, there 
are classes $\mathcal{H}$ with unbounded treewidth such that CSP restricted to this class is polynomial-time solvable. A trivial
example is the class $\mathcal{H}$ of all hypergraphs having only a single hyperedge of arbitrary size. The treewidth of such
hypergraphs can be arbitrarily large (as the treewidth of a hypergraph consisting of a single edge $e$ is exactly $|e| - 1$),
but CSP($\mathcal{H}$) is trivial to solve: we can pick any tuple from the constraint corresponding to the single edge. There are
other, nontrivial, classes of hypergraphs with unbounded treewidth such that CSP($\mathcal{H}$) is solvable in polynomial time:
for example, classes with bounded (generalized) hypertree width [24], bounded fractional edge cover number [28], and
bounded fractional hypertree width [28, 41]. Thus, unlike in the bounded-arity case, treewidth is not the right measure
for characterizing the complexity of the problem. 

Our results. We introduce a new hypergraph width measure that we call submodular width. Small submodular 
width means that for every monotone submodular function b on the vertices of the hypergraph H, there is a tree 
decomposition where b(B) is small for every bag B of the decomposition. (This definition makes sense only if we 
normalize the considered functions: for this reason, we require that $b(e) \le 1$ for every edge $e$ of $H$.) The main result
of the paper is showing that bounded submodular width is the property that precisely characterizes the complexity of
CSP($\mathcal{H}$):

Theorem 1.4 (Main). Let $\mathcal{H}$ be a recursively enumerable class of hypergraphs. Assuming the Exponential Time
Hypothesis, CSP($\mathcal{H}$) parameterized by $H$ is fixed-parameter tractable if and only if $\mathcal{H}$ has bounded submodular width.
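To illustrate the kind of function $b$ that the definition of submodular width quantifies over, the following sketch checks by brute force whether a set function on the vertices of a small hypergraph is monotone, submodular, and satisfies $b(e) \le 1$ on every edge. The helper names and the example functions are assumptions made for this illustration; the check enumerates all pairs of subsets, so it is only usable on tiny vertex sets.

```python
from itertools import combinations


def all_subsets(vertices):
    """Return every subset of the vertex set as a frozenset."""
    vs = sorted(vertices)
    return [frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)]


def is_admissible(b, vertices, edges, eps=1e-9):
    """Check that b is monotone, submodular, and at most 1 on every edge."""
    sets = all_subsets(vertices)
    for x in sets:
        for y in sets:
            # Monotonicity: X subset of Y implies b(X) <= b(Y).
            if x <= y and b(x) > b(y) + eps:
                return False
            # Submodularity: b(X) + b(Y) >= b(X | Y) + b(X & Y).
            if b(x) + b(y) + eps < b(x | y) + b(x & y):
                return False
    # Normalization on the edges of the hypergraph.
    return all(b(frozenset(e)) <= 1 + eps for e in edges)


VERTICES = {"a", "b", "c"}
EDGES = [{"a", "b"}, {"b", "c"}]

half_size = lambda s: len(s) / 2        # modular, hence submodular; b(e) = 1
squared = lambda s: len(s) ** 2 / 4     # monotone but not submodular
full_size = lambda s: len(s)            # submodular, but b(e) = 2 > 1
```

Here `half_size` passes all three checks, `squared` fails submodularity (for two distinct singletons), and `full_size` violates the normalization on the edges.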

Theorem 1.4 has an algorithmic side (algorithm for bounded submodular width) and a complexity side (hardness
result for unbounded submodular width). Unlike previous width measures in the literature, where small value of the
measure suggests a way of solving CSP($\mathcal{H}$), it is not at all clear how bounded submodular width is of any help. In
particular, it is not obvious what submodular functions have to do with CSP instances. The main idea of our algorithm
is that a CSP instance can be "split" into a small number of "uniform" CSP instances; for this purpose, we use a
partitioning procedure inspired by a result of Alon et al. [4]. More precisely, splitting means that we partition the set of
tuples appearing in the constraint relations in a certain way and each new instance inherits only one class of the partition
(thus each new instance has the same set of variables as the original). Uniformity means that for any subsets $B \subseteq A$ of
variables, every solution for the problem restricted to B has roughly the same number of extensions to A. The property 
of uniformity allows us to bound the logarithm of the number of solutions on the different subsets by a submodular 
function. Therefore, bounded submodular width guarantees that each uniform instance has a tree decomposition such 
that in each bag only a polynomially bounded number of solutions has to be considered. 
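The notion of uniformity can be made tangible with a small sketch that counts, for a set of tuples over a variable set $A$ and a subset $B \subseteq A$, how many extensions each partial tuple on $B$ has. This is a simplification and not the paper's procedure: it works on a single explicitly given set of tuples and takes the partial solutions on $B$ to be the projections of those tuples. All names below are hypothetical.

```python
from collections import Counter


def extension_counts(solutions, variables, subset_b):
    """Map each projection to subset_b to the number of tuples extending it.

    solutions -- set of tuples, one value per entry of `variables` (the set A)
    subset_b  -- sequence of variables forming the subset B of A
    """
    positions = [variables.index(v) for v in subset_b]
    return Counter(tuple(t[i] for i in positions) for t in solutions)


def uniformity_ratio(solutions, variables, subset_b):
    """Largest extension count divided by the smallest one (1.0 = uniform)."""
    counts = extension_counts(solutions, variables, subset_b)
    return max(counts.values()) / min(counts.values())


VARS = ("x", "y")
SKEWED = {(0, 0), (0, 1), (0, 2), (1, 0)}    # x=0 has 3 extensions, x=1 has 1
UNIFORM = {(0, 0), (0, 1), (1, 0), (1, 2)}   # every value of x has 2 extensions

skewed_ratio = uniformity_ratio(SKEWED, VARS, ("x",))
uniform_ratio = uniformity_ratio(UNIFORM, VARS, ("x",))
```

In the uniform case the number of tuples on $A$ is the number of tuples on $B$ times a common extension count, which is the kind of multiplicative behavior that makes the logarithm of the solution counts amenable to a submodular bound.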
Conceptually, our algorithm goes beyond previous decomposition techniques in two ways. First, the tree decomposition
that we use depends not only on the hypergraph, but on the actual constraint relations in the instance (we remark
that this idea first appeared in P4l in a different context that does not directly apply to our problem). Second, we are
not only decomposing the set of variables, but we also split the constraint relations. This way, we can apply different 
decompositions to different parts of the solution space. 

The proof of the complexity side of Theorem 1.4 follows the same high-level strategy as the proof of Theorem 1.2
in [42]. In a nutshell, the argument of [42] is the following: if treewidth is large, then there is a subset of vertices which
is highly connected in the sense that the set does not have a small balanced separator; such a highly connected set
implies that there is a uniform concurrent flow (i.e., a compatible set of flows connecting every pair of vertices in the
set); the paths in the flows can be used to embed the graph of a 3SAT formula; and finally this embedding can be used
to reduce 3SAT to CSP. These arguments build heavily on well-known characterizations of treewidth and results from
combinatorial optimization (such as the $O(\log k)$ integrality gap of sparsest cut). The proof of Theorem 1.4 follows this
outline, but now no such well-known tools are available: we are dealing with hypergraphs and submodular functions
in a way that was not explored before in the literature. Thus we have to build from scratch all the necessary tools. One
of the main difficulties of obtaining Theorem 1.4 is that we have to work in three different domains:

• CSP instances. As our goal is to investigate the existence of algorithms solving CSP, the most obvious domain 
is CSP instances. In light of previous results, we are especially interested in algorithms based on tree decompo- 
sitions. For such algorithms, what matters is the existence of subsets of vertices such that restricting the instance 
to any of these subsets gives an instance with "small" number of solutions. In order to solve the instance, we 
would like to find a tree decomposition where every bag is such a small set. 

• Submodular functions. Submodular width is defined in terms of submodular functions, thus submodular functions
defined on hypergraphs are our second natural domain. We need to understand what large submodular width
means, that is, what property of the submodular function and the hypergraph makes it impossible to obtain a tree 
decomposition where every bag has small value. 

• Flows and embeddings in hypergraphs. In the hardness proof, our goal is to embed the graph of a 3SAT for- 
mula into a hypergraph. Thus we need to define an appropriate notion of embedding and study what guarantees 
the existence of embeddings with suitable properties. As in [42], we use the paths appearing in flows to construct
embeddings. For our purposes, the right notion of flow is a collection of weighted paths where the total
weight of the paths intersecting each hyperedge is at most 1. This notion of flows has not been studied in the
literature before, thus we need to obtain basic results on such flows, such as exploring the duality between flows 
and separators. 

A key question is how to find connections between these domains. As mentioned above and detailed in Section 4,
we have a procedure that reduces a CSP instance into a set of uniform CSP instances, and the number of solutions on
the different subsets of variables in a uniform CSP instance can be described by a submodular function. This method
allows us to move from the domain of CSP instances to the domain of submodular functions. Section 5 is devoted
to showing that if submodular width of a hypergraph is large, then there is a certain "highly connected" set in the
hypergraph. Highly connected set is defined as a property of the hypergraph and has no longer anything to do with
submodular functions. Thus this connection allows us to move from the domain of submodular functions to the study
of hypergraphs. In Section 6, we show that a highly connected set in a hypergraph means that graphs can be efficiently
embedded into the hypergraph. In particular, the graph of a 3SAT formula can be embedded into the hypergraph, which
gives us (as shown in Section 7) a reduction from 3SAT to CSP($\mathcal{H}$). This connection allows us to move from the domain
of embeddings back to the domain of CSP instances. We remark that Sections 4-7 are written in a self-contained way:
only the first theorem of each section is used outside the section. 

As a consequence of our characterization of submodular width, we obtain the surprising result that bounded sub- 
modular width equals bounded adaptive width (defined in [44]): 

Theorem 1.5. A class of hypergraphs has bounded submodular width if and only if it has bounded adaptive width. 

It is proved in [44] that there are classes of hypergraphs having bounded adaptive width (and hence bounded 
submodular width), but unbounded fractional hypertree width. Previously, bounded fractional hypertree width was the
most general property that was known to guarantee fixed-parameter tractability [28]. Thus Theorem 1.4 not only gives a
complete characterization of the parameterized complexity of CSP($\mathcal{H}$), but its algorithmic side proves fixed-parameter
tractability in a strictly more general case than what was known before. 

Why fixed-parameter tractability? We argue that investigating the fixed-parameter tractability of CSP($\mathcal{H}$) is
at least as interesting as investigating polynomial-time solvability. In problems coming from our database-theoretic
motivation, the size of the hypergraph (that is, the size of the query) is assumed to be much smaller than the input size
(which is usually dominated by the size of the database), hence a constant factor in the running time depending only on
the number of variables (or on the hypergraph) is acceptable.¹ Even the STOC 1977 landmark paper of Chandra and
Merlin 021, which started the complexity research on conjunctive queries, suggests spending exponential time (in the
size of the query) on finding the best possible evaluation order. Furthermore, the notion of fixed-parameter tractability 
formalizes the usual viewpoint of the literature on conjunctive queries: in the complexity analysis, we should analyze 
separately the contribution of the query size and the contribution of the database size. 

By aiming for fixed-parameter tractability, we can focus more on the core algorithmic question: is there some 
method for decomposing the space of all solutions in a way that allows efficient evaluation of the query? Some of the 
progress in this area was made by introducing new decomposition techniques, without showing how to actually find 
such decompositions. For example, this was the case for the papers introducing query width [14] and fractional hypertree
width [28]: it was shown that if a certain type of decomposition is given, then the problem can be solved in polynomial
time. In our terminology, these results already show the fixed-parameter tractability of CSP($\mathcal{H}$) for the classes
$\mathcal{H}$ where such decompositions exist (since the time required to find an appropriate decomposition can be bounded by
a function of $H$ only), but do not give polynomial-time algorithms. It took some more time and effort to come up with
polynomial-time (approximation) algorithms for finding such decompositions G31I4T1 . While investigating algorithms
for finding decompositions gives rise to interesting and important problems, they are purely combinatorial problems
on graphs and hypergraphs, and no longer have anything to do with query evaluation, constraints, or databases. Thus
fixed-parameter tractability gives us a formal way of ignoring these issues and focusing exclusively on the evaluation 
problem. 

On the complexity side, fixed-parameter tractability of CSP($\mathcal{H}$) seems to be a more robust question than polynomial-time
solvability. For example, any polynomial-time reduction to CSP($\mathcal{H}$) should be able to pick a member of $\mathcal{H}$, thus it
seems that polynomial-time reduction to CSP($\mathcal{H}$) is only possible if certain artificial technical conditions are imposed
on $\mathcal{H}$ (such as there is an algorithm efficiently generating appropriate members of $\mathcal{H}$). Furthermore, there are classes $\mathcal{H}$
for which CSP($\mathcal{H}$) is polynomial-time equivalent to LOG Clique [27], thus we cannot hope to classify CSP($\mathcal{H}$) into
polynomial-time solvable and NP-hard cases. Another difficulty in understanding polynomial-time solvability is that
it can depend on the "irrelevant" parts of the hypergraph. Suppose for example that there is a class $\mathcal{H}$ for which CSP($\mathcal{H}$)
is not polynomial-time solvable, but it is fixed-parameter tractable: it can be solved in time $f(H) \cdot \|I\|^{O(1)}$. Let $\mathcal{H}'$
be constructed the following way: for every $H \in \mathcal{H}$, class $\mathcal{H}'$ contains a hypergraph $H'$ that is obtained from $H$ by
adding a new component that is a path of length $f(H)$. This new path is trivial with respect to the CSP problem, thus
any algorithm for CSP($\mathcal{H}$) can be used for CSP($\mathcal{H}'$) as well. Consider an instance $I$ of CSP($\mathcal{H}'$) having hypergraph $H'$,
which was obtained from hypergraph $H$. After taking care of the path, the assumed algorithm for CSP($\mathcal{H}$) can solve
this instance in time $f(H) \cdot \|I\|^{O(1)}$, which is polynomial in $\|I\|$: instance $I$ contains a representation of $H'$, which
has at least $f(H)$ vertices, thus $\|I\|$ is at least $f(H)$. Therefore, CSP($\mathcal{H}'$) is polynomial-time solvable. This example
shows that aiming for polynomial-time solvability instead of fixed-parameter tractability might require understanding
such subtle, but mostly irrelevant phenomena.

In the hardness results obtained so far, evidence for the non-existence of polynomial-time algorithms is given not 
in the form of NP-hardness, but by giving evidence that the problem is not even fixed-parameter tractable. In Theorem 1.1,
it is a remarkable coincidence that polynomial-time solvability and fixed-parameter tractability are equivalent.
However, there is no reason to expect this to remain true in more general cases. Therefore, as discussed above, it makes 
sense to focus first on understanding the fixed-parameter tractability of the problem. 

1 This assumption is valid only for evaluation problems (where the problem instance includes a large database) and not for problems that
involve only queries, such as the Conjunctive Query Containment problem.
2 Preliminaries



Constraint satisfaction problems. We briefly recall the most important notions related to CSP. For more background, 
see e.g., [26, 20].

Definition 2.1. An instance of a constraint satisfaction problem is a triple (V,D,C), where: 

• V is a set of variables, 

• D is a domain of values, 

• C is a set of constraints, $\{c_1, c_2, \dots, c_q\}$. Each constraint $c_i \in C$ is a pair $\langle s_i, R_i \rangle$, where:

- $s_i$ is a tuple of variables of length $m_i$, called the constraint scope, and

- $R_i$ is an $m_i$-ary relation over D, called the constraint relation.

For each constraint $\langle s_i, R_i \rangle$ the tuples of $R_i$ indicate the allowed combinations of simultaneous values for the variables
in $s_i$. The length $m_i$ of the tuple $s_i$ is called the arity of the constraint. A solution to a constraint satisfaction
problem instance is a function $f$ from the set of variables V to the domain of values D such that for each constraint
$\langle s_i, R_i \rangle$ with $s_i = (v_{i_1}, v_{i_2}, \dots, v_{i_m})$, the tuple $(f(v_{i_1}), f(v_{i_2}), \dots, f(v_{i_m}))$ is a member of $R_i$. We say that an instance is
binary if each constraint relation is binary, i.e., $m_i = 2$ for each constraint.² It can be assumed that the instance does
not contain two constraints $\langle s_i, R_i \rangle$, $\langle s_j, R_j \rangle$ with $s_i = s_j$, since in this case the two constraints can be replaced by the
constraint $\langle s_i, R_i \cap R_j \rangle$.
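Definition 2.1 translates almost directly into code. The sketch below represents a constraint as a pair (scope, relation), checks whether an assignment is a solution, and merges constraints that share a scope by intersecting their relations, as in the remark above. The function names and the sample instance are assumptions chosen for this illustration.

```python
def is_solution(f, constraints):
    """Return True if assignment f (dict: variable -> value) satisfies every constraint."""
    return all(tuple(f[v] for v in scope) in relation
               for scope, relation in constraints)


def merge_same_scope(constraints):
    """Replace constraints with identical scopes by one constraint on the intersection."""
    merged = {}
    for scope, relation in constraints:
        if scope in merged:
            merged[scope] = merged[scope] & set(relation)
        else:
            merged[scope] = set(relation)
    return list(merged.items())


# Hypothetical instance: two constraints share the scope (x, y).
CONSTRAINTS = [
    (("x", "y"), {(0, 1), (1, 0), (1, 1)}),
    (("x", "y"), {(0, 1), (1, 1)}),
    (("y", "z"), {(1, 0)}),
]

MERGED = merge_same_scope(CONSTRAINTS)
```

Merging does not change the set of solutions: an assignment satisfies both original constraints on a scope exactly when its tuple lies in the intersection of the two relations.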

In the input, the relation appearing in a constraint is represented by listing all the tuples of the constraint. We
denote by $\|I\|$ the size of the representation of the instance $I = (V,D,C)$. It can be assumed that $|D| \le \|I\|$: elements of
D that do not appear in any relation can be safely removed.

Let $I = (V,D,C)$ be a CSP instance and let $V' \subseteq V$ be a nonempty subset of variables. If $f$ is a solution of $I$, then
$\mathrm{pr}_{V'} f$ is the projection of $f$ to $V'$, which is simply the restriction of the function $f : V \to D$ to $V' \subseteq V$. If $R$ is a set of
solutions for $I$, then we let $\mathrm{pr}_{V'} R = \{\mathrm{pr}_{V'} f \mid f \in R\}$.

The projection $\mathrm{pr}_{V'} I$ of $I$ to $V'$ is a CSP $I' = (V',D,C')$, where $C'$ is defined the following way: For each constraint
$c = \langle (v_1, \dots, v_k), R \rangle$ having at least one variable in $V'$, there is a corresponding constraint $c'$ in $C'$. Suppose that
$v_{i_1}, \dots, v_{i_\ell}$ are the variables among $v_1, \dots, v_k$ that are in $V'$. Then the constraint $c'$ is defined as $\langle (v_{i_1}, \dots, v_{i_\ell}), R' \rangle$, where
the relation $R'$ is the projection of $R$ to the components $i_1, \dots, i_\ell$, that is, $R'$ contains an $\ell$-tuple $(d'_1, \dots, d'_\ell) \in D^\ell$ if and
only if there is a $k$-tuple $(d_1, \dots, d_k) \in R$ such that $d'_j = d_{i_j}$ for $1 \le j \le \ell$. Clearly, if $f$ is a solution of $I$, then $\mathrm{pr}_{V'} f$ is a
solution of $\mathrm{pr}_{V'} I$. For a subset $V' \subseteq V$, we denote by $\mathrm{sol}_I(V')$ the set of all solutions of $\mathrm{pr}_{V'} I$. If the instance $I$ is clear
from the context, we drop the subscript.
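The projection of an instance can be sketched as follows; the brute-force `solutions` helper then computes $\mathrm{sol}(V')$ for a small example. The function names and the sample instance are assumptions for this illustration, and the enumeration is exponential in the number of variables.

```python
from itertools import product


def project_instance(variables, domain, constraints, subset):
    """Return the projection of the instance (V, D, C) to the variables in `subset`."""
    keep = set(subset)
    projected = []
    for scope, relation in constraints:
        # Positions of the scope variables that survive the projection.
        positions = [i for i, v in enumerate(scope) if v in keep]
        if positions:
            new_scope = tuple(scope[i] for i in positions)
            new_relation = {tuple(t[i] for i in positions) for t in relation}
            projected.append((new_scope, new_relation))
    return [v for v in variables if v in keep], domain, projected


def solutions(variables, domain, constraints):
    """Enumerate all solutions of an instance by brute force."""
    result = []
    for values in product(domain, repeat=len(variables)):
        f = dict(zip(variables, values))
        if all(tuple(f[v] for v in scope) in rel for scope, rel in constraints):
            result.append(f)
    return result


INSTANCE = (
    ["x", "y", "z"],
    [0, 1],
    [(("x", "y"), {(0, 1), (1, 0)}), (("y", "z"), {(1, 1)})],
)

full_solutions = solutions(*INSTANCE)
sol_x = solutions(*project_instance(*INSTANCE, ["x"]))
```

In this example the instance has a single solution (x=0, y=1, z=1), yet the projected instance on {x} has two solutions. This shows that $\mathrm{sol}(V')$ can be strictly larger than the set of projections of the solutions of the whole instance: the projection of a solution is always a solution of the projected instance, but not conversely.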

The primal graph (or Gaifman graph) of a CSP instance $I = (V,D,C)$ is a graph with vertex set $V$ such that $u, v \in V$
are adjacent if and only if there is a constraint whose scope contains both $u$ and $v$. The hypergraph of a CSP instance
$I = (V,D,C)$ is a hypergraph $H$ with vertex set $V$, where $e \subseteq V$ is an edge of $H$ if and only if there is a constraint whose
scope is $e$ (more precisely, an $|e|$-tuple $s$, whose coordinates form a permutation of the elements of $e$). For a class $\mathcal{H}$ of
hypergraphs, we denote by CSP($\mathcal{H}$) the problem restricted to instances whose hypergraph is in $\mathcal{H}$.

Graphs and hypergraphs. If $G$ is a graph or hypergraph, then we denote by $V(G)$ and $E(G)$ the set of vertices
and the set of edges of $G$, respectively. Vertices $u, v \in V(G)$ are adjacent if there is an edge $e \in E(G)$ with $u, v \in e$.
A set $K \subseteq V(G)$ is a clique if the vertices in $K$ are pairwise adjacent. If $H$ is a hypergraph and $V' \subseteq V(H)$, then the
subhypergraph induced by $V'$ is a hypergraph $H'$ with vertex set $V'$ and $\emptyset \subset e' \subseteq V'$ is an edge of $H'$ if and only if there
is an edge $e \in E(H)$ with $e \cap V' = e'$. We denote by $H \setminus S$ the subhypergraph of $H$ induced by $V(H) \setminus S$.

Paths, separators, and flows in hypergraphs. A path $P$ in hypergraph $H$ is an ordered sequence $v_0, v_1, \dots, v_r$ of
vertices such that $v_i$ and $v_{i-1}$ are adjacent for every $1 \le i \le r$. We distinguish the endpoints of a path: vertex $v_0$ is the
first endpoint of $P$ and $v_r$ is the second endpoint of $P$. A path is an $X - Y$ path if its first endpoint is in $X$ and its second
endpoint is in $Y$. A path $P = v_1 v_2 \dots v_t$ is minimal if there are no shortcuts, i.e., $v_i$ and $v_j$ are not adjacent if $|i - j| > 1$.
Note that a minimal path intersects each edge at most twice.

2 It is unfortunate that some communities use the notion "binary CSP" in the sense that each constraint is binary (as in this paper), while other
communities use it in the sense that the variables are 0-1, i.e., the domain size is 2.

Let $H$ be a hypergraph and $X, Y \subseteq V(H)$ be two (not necessarily disjoint) sets of vertices. An $(X,Y)$-separator is a
set $S \subseteq V(H)$ of vertices such that there is no $(X \setminus S) - (Y \setminus S)$ path in $H \setminus S$, or in other words, every $X - Y$ path of $H$
contains at least one vertex of $S$. In particular, this means that $X \cap Y \subseteq S$.

An assignment $s : E(H) \to \mathbb{R}^+$ is a fractional $(X,Y)$-separator if every $X - Y$ path $P$ is covered by $s$, that is,
$\sum_{e \in E(H),\, e \cap P \neq \emptyset} s(e) \ge 1$. The weight of the fractional separator $s$ is $\sum_{e \in E(H)} s(e)$.

Let $H$ be a hypergraph and let $\mathcal{P}$ be the set of all paths in $H$. A flow of $H$ is an assignment $f : \mathcal{P} \to \mathbb{R}^+$ such that
$\sum_{P \in \mathcal{P},\, P \cap e \neq \emptyset} f(P) \le 1$ for every $e \in E(H)$. The value of the flow $f$ is $\sum_{P \in \mathcal{P}} f(P)$. We say that a path $P$ appears in flow
$f$, or simply $P$ is a path of $f$, if $f(P) > 0$. For some $X, Y \subseteq V(H)$, an $(X,Y)$-flow is a flow $f$ such that only $X - Y$ paths
appear in $f$. A standard LP duality argument shows that the minimum weight of a fractional $(X,Y)$-separator is equal
to the maximum value of an $(X,Y)$-flow.

If $f, f'$ are flows such that $f'(P) \le f(P)$ for every path $P$, then $f'$ is a subflow of $f$. The sum of the flows $f_1, \dots, f_r$
is a mapping that assigns weight $\sum_{i=1}^{r} f_i(P)$ to each path $P$. Note that the sum of flows is not necessarily a flow itself.
If the sum of $f_1, \dots, f_r$ happens to be a flow, then we say that $f_1, \dots, f_r$ are compatible.

Highly connected sets. An important step in understanding various width measures is showing that if the measure
is large, then the (hyper)graph contains a highly connected set (in a certain sense). We define here the notion of highly
connected that will be used in the paper. First, recall that a fractional independent set of a hypergraph $H$ is a mapping
$\mu : V(H) \to [0,1]$ such that $\sum_{v \in e} \mu(v) \le 1$ for every $e \in E(H)$. We extend functions on the vertices of $H$ to subsets
of vertices of $H$ the natural way by setting $\mu(X) := \sum_{v \in X} \mu(v)$, thus $\mu$ is a fractional independent set if and only if
$\mu(e) \le 1$ for every $e \in E(H)$.

Let $\mu$ be a fractional independent set of hypergraph $H$ and let $\lambda > 0$ be a constant. We say that a set $W \subseteq V(H)$
is $(\mu,\lambda)$-connected if for any two disjoint sets $A, B \subseteq W$, the minimum weight of a fractional $(A,B)$-separator is at
least $\lambda \cdot \min\{\mu(A), \mu(B)\}$. Note that if $W$ is $(\mu,\lambda)$-connected, then $W$ is $(\mu,\lambda')$-connected for every $\lambda' < \lambda$ and every
$W' \subseteq W$ is also $(\mu,\lambda)$-connected. Informally, if $W$ is $(\mu,\lambda)$-connected for some fractional independent set $\mu$
such that $\mu(W)$ is "large", then we call $W$ a highly connected set. For $\lambda > 0$, we denote by $\mathrm{con}_\lambda(H)$ the maximum of
$\mu(W)$, taken over every $(\mu,\lambda)$-connected set $W$ of $H$. Note that if $\lambda' < \lambda$, then $\mathrm{con}_{\lambda'}(H) \ge \mathrm{con}_\lambda(H)$. Throughout the
paper, $\lambda$ can be thought of as a sufficiently small universal constant, say, 0.001.

Embeddings. The hardness result presented in the paper and earlier hardness results for CSP($\mathcal{H}$) [27, 44, 42]
are based on embedding some other problem (with a certain graph structure) in a CSP instance whose hypergraph
is a member of $\mathcal{H}$. Thus we need appropriate notions of embedding a graph in a (hyper)graph. Let us first recall the
definition of minors in graphs. A graph $H$ is a minor of $G$ if $H$ can be obtained from $G$ by a sequence of vertex deletions,
edge deletions, and edge contractions. The following alternative definition is more relevant from the viewpoint of
embeddings: a graph $F$ is a minor of $G$ if there is a mapping $\psi$ that maps each vertex of $F$ to a connected subset of
$V(G)$ such that $\psi(u) \cap \psi(v) = \emptyset$ for $u \neq v$, and if $u, v \in V(F)$ are adjacent in $F$, then there is an edge in $E(G)$ connecting
$\psi(u)$ and $\psi(v)$.

A crucial difference between the proof of Theorem 1.1 in [27] and the proof of Theorem 1.2 in [42] is that the
former result is based on finding a minor embedding of a grid, while the latter result uses an embedding where the
images of distinct vertices are not necessarily disjoint, but can overlap in a controlled way. We define such embeddings
the following way. We say that two sets of vertices $X, Y \subseteq V(H)$ touch if either $X \cap Y \neq \emptyset$, or there is an edge $e \in E(H)$
intersecting both $X$ and $Y$. An embedding of graph $G$ into hypergraph $H$ is a mapping $\psi$ that maps each vertex of $G$
to a connected subset of $V(H)$ such that if $u$ and $v$ are adjacent in $G$, then $\psi(u)$ and $\psi(v)$ touch. The depth of a vertex
$v \in V(H)$ in embedding $\psi$ is $d_\psi(v) := |\{u \in V(G) \mid v \in \psi(u)\}|$, the number of vertices of $G$ whose images contain $v$.
The vertex depth of the embedding is $\max_{v \in V(H)} d_\psi(v)$. Observe that $\psi$ is a minor mapping if and only if it has vertex
depth 1. Because in our case we want to control the size of the constraint relations, we need a notion of depth that is
sensitive to "what the edges see." We define edge depth of $\psi$ to be $\max_{e \in E(H)} \sum_{v \in e} d_\psi(v)$. Equivalently, we can define
edge depth as the maximum of $\sum_{v \in V(G)} |\psi(v) \cap e|$, taken over all edges $e$ of $H$.
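
As a small illustration of vertex depth and edge depth, the following sketch evaluates both for a hand-made embedding of a triangle into a two-edge hypergraph. The function names, the dictionary representation of the embedding, and the example are assumptions made for illustration; the connectivity of each image is assumed rather than checked.

```python
def touch(X, Y, hyperedges):
    """X and Y touch if they intersect or some edge of H meets both."""
    return bool(X & Y) or any(bool(e & X) and bool(e & Y) for e in hyperedges)

def is_embedding(psi, graph_edges, hyperedges):
    """Checks only the touching condition; each image is assumed to be connected."""
    return all(touch(psi[u], psi[v], hyperedges) for u, v in graph_edges)

def vertex_depth(psi, hyper_vertices):
    """max over v in V(H) of the number of images containing v."""
    return max(sum(1 for image in psi.values() if v in image) for v in hyper_vertices)

def edge_depth(psi, hyperedges):
    """max over edges e of the sum over u in V(G) of |psi(u) & e|."""
    return max(sum(len(image & e) for image in psi.values()) for e in hyperedges)

def edge_depth_via_vertices(psi, hyperedges):
    """Equivalent form: max over edges e of the sum of vertex depths d(v), v in e."""
    depth = lambda v: sum(1 for image in psi.values() if v in image)
    return max(sum(depth(v) for v in e) for e in hyperedges)

# Hypothetical example: H has vertices 1, 2, 3 and edges {1,2}, {2,3}; G is a triangle a, b, c.
H_vertices = {1, 2, 3}
H_edges = [frozenset({1, 2}), frozenset({2, 3})]
G_edges = [("a", "b"), ("b", "c"), ("a", "c")]
psi = {"a": frozenset({1}), "b": frozenset({2}), "c": frozenset({2, 3})}

valid = is_embedding(psi, G_edges, H_edges)
vd = vertex_depth(psi, H_vertices)
ed = edge_depth(psi, H_edges)
```

In this example vertex 2 lies in the images of both b and c, so the mapping is not a minor mapping, and each edge of H "sees" a total image size of 3.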

Trivially, for any graph $G$ and hypergraph $H$, there is an embedding of $G$ into $H$ having vertex depth and edge
depth at most $|V(G)|$. If $G$ has $m$ edges and no isolated vertices, then $|V(G)|$ is at most $2m$. We are interested in how
much we can gain compared to this trivial solution of depth $O(m)$. We define the embedding power $\mathrm{emb}(H)$ to be the
maximum (supremum) value of $\alpha$ for which there is an integer $m_\alpha$ such that every graph $G$ with $m \ge m_\alpha$ edges has an
embedding into $H$ with edge depth $m/\alpha$. It might look unmotivated that we define embedding power in terms of the
number of edges of $G$: defining it in terms of the number of vertices might look more natural. However, if we replace
the number $m$ of edges with the number $n$ of vertices in the definition, then the worst case occurs if $G$ is a clique on $n$
vertices. Such a definition would describe how well cliques can be embedded, and would give us no information about
how sparse graphs can be embedded.

3 Width parameters 

Treewidth and its various generalizations are defined in this section. We follow the framework of width functions
introduced by Adler [1]. A tree decomposition of a hypergraph $H$ is a tuple $(T, (B_t)_{t \in V(T)})$, where $T$ is a tree and
$(B_t)_{t \in V(T)}$ is a family of subsets of $V(H)$ satisfying the following two conditions: (1) for each $e \in E(H)$ there is a node
$t \in V(T)$ such that $e \subseteq B_t$, and (2) for each $v \in V(H)$ the set $\{t \in V(T) \mid v \in B_t\}$ is connected in $T$. The sets $B_t$ are
called the bags of the decomposition. Let $f : 2^{V(H)} \to \mathbb{R}^+$ be a function that assigns a nonnegative real number to each
nonempty subset of vertices. The $f$-width of a tree decomposition $(T, (B_t)_{t \in V(T)})$ is $\max\{f(B_t) \mid t \in V(T)\}$. The $f$-width
of a hypergraph $H$ is the minimum of the $f$-widths of all its tree decompositions. In other words, $f$-width$(H) \le w$
if and only if there is a tree decomposition of $H$ where $f(B) \le w$ for every bag $B$.
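
The two conditions of a tree decomposition and the $f$-width of a given decomposition can be checked mechanically. The sketch below is only an illustration under assumed conventions (bags stored in a dictionary keyed by tree node, hyperedges as frozensets, and the tree structure taken on trust); it does not search for an optimal decomposition.

```python
def is_tree_decomposition(hyperedges, tree_edges, bags):
    """Check conditions (1) and (2); tree_edges is assumed (not verified) to form a tree on the bag keys."""
    if not all(any(e <= bag for bag in bags.values()) for e in hyperedges):
        return False  # (1) fails: some hyperedge is contained in no bag
    neighbors = {t: set() for t in bags}
    for a, b in tree_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for v in set().union(*hyperedges):
        nodes = {t for t, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            for q in neighbors[stack.pop()]:
                if q in nodes and q not in seen:
                    seen.add(q)
                    stack.append(q)
        if seen != nodes:
            return False  # (2) fails: the bags containing v are not connected in T
    return True

def f_width(bags, f):
    """f-width of one decomposition: the largest value of f over its bags."""
    return max(f(bag) for bag in bags.values())

# Hypothetical example: the 4-cycle 1-2-3-4-1 with two bags.
cycle_edges = [frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 1)]]
good_bags = {"t1": frozenset({1, 2, 3}), "t2": frozenset({1, 3, 4})}
bad_bags = {"t1": frozenset({1, 2}), "t2": frozenset({3, 4})}
tree = [("t1", "t2")]

good_ok = is_tree_decomposition(cycle_edges, tree, good_bags)
bad_ok = is_tree_decomposition(cycle_edges, tree, bad_bags)
s_width = f_width(good_bags, lambda B: len(B) - 1)  # f = s, i.e., |B| - 1
```

With $f(B) = |B| - 1$ the valid decomposition above has width 2.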

The main idea of tree decomposition based algorithms is that if we have a tree decomposition for instance $I$ such
that at most $C$ assignments on $B_t$ have to be considered for each bag $B_t$, then the problem can be solved by dynamic
programming in time polynomial in $C$ and $\|I\|$. The various width notions try to guarantee the existence of such
decompositions. The simplest such notion, treewidth, can be defined as follows:

Definition 3.1. Let $s(B) = |B| - 1$. The treewidth of $H$ is $\mathrm{tw}(H) := s$-width$(H)$.

Further width notions defined in the literature can also be conveniently defined using this setup. A subset $E' \subseteq E(H)$
is an edge cover if $\bigcup E' = V(H)$. The edge cover number $\rho(H)$ is the size of the smallest edge cover (here we assume
that $H$ has no isolated vertices). For $X \subseteq V(H)$, let $\rho_H(X)$ be the size of the smallest set of edges covering $X$.

Definition 3.2. The generalized hypertree width of $H$ is $\mathrm{hw}(H) := \rho_H$-width$(H)$.

The original (nongeneralized) definition [23] of hypertree width includes an additional requirement on the
decomposition (we omit the details), thus it cannot be less than generalized hypertree width. However, it is known that hypertree
width and generalized hypertree width can differ by at most a constant factor El . 

Grohe and Marx [28] further generalized hypertree width by considering linear relaxations of edge covers. A
function $\gamma : E(H) \to [0,1]$ is a fractional edge cover of $H$ if $\sum_{e : v \in e} \gamma(e) \ge 1$ for every $v \in V(H)$. The fractional cover
number $\rho^*(H)$ of $H$ is the minimum of $\sum_{e \in E(H)} \gamma(e)$ taken over all fractional edge covers of $H$ (it is well known that this
minimum is achieved by some rational $\gamma$). We define $\rho^*_H(X)$ analogously to $\rho_H(X)$: the requirement $\sum_{e : v \in e} \gamma(e) \ge 1$ is
restricted to vertices $v \in X$.
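
To see the gap between $\rho_H$ and $\rho^*_H$ on a tiny example, the sketch below checks a candidate fractional edge cover of the triangle and compares it with the integral edge cover number. All names and the example are illustrative assumptions. Optimality of the fractional cover is certified here by exhibiting a fractional independent set of the same total weight, which relies on the standard weak LP duality between the two notions; no LP solver is used.

```python
from fractions import Fraction
from itertools import combinations

def is_fractional_edge_cover(gamma, hyperedges, X):
    """gamma[i] in [0,1] is the weight of hyperedges[i]; every v in X must receive total weight >= 1."""
    return all(0 <= g <= 1 for g in gamma) and all(
        sum(g for g, e in zip(gamma, hyperedges) if v in e) >= 1 for v in X)

def is_fractional_independent_set(mu, hyperedges):
    """mu maps vertices to [0,1]; every edge must carry total weight <= 1."""
    return all(0 <= m <= 1 for m in mu.values()) and all(
        sum(mu[v] for v in e) <= 1 for e in hyperedges)

def edge_cover_number(hyperedges, X):
    """rho_H(X) by brute force: the fewest edges whose union contains X (None if impossible)."""
    for k in range(len(hyperedges) + 1):
        for chosen in combinations(hyperedges, k):
            if X <= set().union(*chosen):
                return k
    return None

# Hypothetical example: the triangle on vertices 1, 2, 3.
triangle = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
X = {1, 2, 3}
half = Fraction(1, 2)
gamma = [half, half, half]
mu = {1: half, 2: half, 3: half}

cover_ok = is_fractional_edge_cover(gamma, triangle, X)
cover_weight = sum(gamma)
independent_ok = is_fractional_independent_set(mu, triangle)
independent_weight = sum(mu.values())
integral = edge_cover_number(triangle, X)
```

Since the fractional cover and the fractional independent set both have weight 3/2, the fractional cover number of the triangle is 3/2, while two whole edges are needed for an integral cover.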

Definition 3.3. The fractional hypertree width of $H$ is $\mathrm{fhw}(H) := \rho^*_H$-width$(H)$.

A crucial idea in [44] is to make the choice of tree decomposition adaptive: instead of assigning a single
decomposition to each hypergraph, we choose the best decomposition based on additional properties of the current instance.
Motivated by this idea, we generalize the notion of $f$-width from a single function $f$ to a class of functions $\mathcal{F}$. Let
$\mathcal{F}$ be an arbitrary (possibly infinite) class of functions that assign nonnegative real numbers to nonempty subsets of
vertices. The $\mathcal{F}$-width of a hypergraph $H$ is $\mathcal{F}$-width$(H) := \sup\{f\text{-width}(H) \mid f \in \mathcal{F}\}$. Thus if $\mathcal{F}$-width$(H) \le k$, then
for every $f \in \mathcal{F}$, hypergraph $H$ has a tree decomposition with $f$-width at most $k$. Note that this tree decomposition
can be different for the different functions $f$. For normalization purposes, we consider only functions $f$ on $V(H)$ that
satisfy $f(\emptyset) = 0$ and that are edge-dominated, that is, $f(e) \le 1$ holds for every $e \in E(H)$.

Using these definitions, we can define adaptive width, introduced in [44], as follows. Recall that in Section 2 we
stated that if $\mu$ is a fractional independent set, then $\mu$ is extended to subsets of vertices by defining $\mu(X) := \sum_{v \in X} \mu(v)$
for every $X \subseteq V(H)$.

Definition 3.4. The adaptive width $\mathrm{adw}(H)$ of a hypergraph $H$ is $\mathcal{F}$-width$(H)$, where $\mathcal{F}$ is the set of all fractional
independent sets of $H$.

A function $f : 2^{V(H)} \to \mathbb{R}^+$ is modular if $f(X) = \sum_{v \in X} c_v$ for some constants $c_v$ ($v \in V(H)$). The function $\mu(X)$
arising from a fractional independent set is clearly a modular and edge-dominated function; in fact, in Definition 3.4
we can define $\mathcal{F}$ as the set of all nonnegative modular edge-dominated functions on $V(H)$. The main new definition
of the paper is a new width measure, which is obtained by imposing a requirement weaker than modularity on the
functions in $\mathcal{F}$ (hence the considered set $\mathcal{F}$ of functions is larger):

Definition 3.5. A function $b : 2^{V(H)} \to \mathbb{R}^+$ is submodular if $b(X) + b(Y) \ge b(X \cap Y) + b(X \cup Y)$ holds for every
$X, Y \subseteq V(H)$. Given a hypergraph $H$, let $\mathcal{F}$ contain every edge-dominated monotone submodular function $b$ on $V(H)$
with $b(\emptyset) = 0$. The submodular width $\mathrm{subw}(H)$ of hypergraph $H$ is $\mathcal{F}$-width$(H)$.

It is well known that submodular functions can be equivalently characterized by the property that $b(X \cup v) - b(X)$,
the marginal value of $v$ with respect to $X$, is a nonincreasing function of $X$. That is, for every $v$ and $X \subseteq Y$,

$$b(X \cup v) - b(X) \ge b(Y \cup v) - b(Y). \qquad (1)$$
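
Both formulations of submodularity can be tested by brute force on a small ground set. The sketch below is an illustration only; the two example functions (a truncated cardinality function and a squared cardinality function) are assumptions chosen to give one positive and one negative case. The marginal-value check is restricted to $v \notin Y$, which is the standard form of the characterization and agrees with (1) for monotone functions.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_submodular(b, ground):
    """b(X) + b(Y) >= b(X & Y) + b(X | Y) for all X, Y."""
    S = subsets(ground)
    return all(b(X) + b(Y) >= b(X & Y) + b(X | Y) for X in S for Y in S)

def has_nonincreasing_marginals(b, ground):
    """b(X + v) - b(X) >= b(Y + v) - b(Y) for all X subset of Y and v outside Y."""
    S = subsets(ground)
    return all(b(X | {v}) - b(X) >= b(Y | {v}) - b(Y)
               for X in S for Y in S if X <= Y
               for v in ground if v not in Y)

ground = {1, 2, 3}
truncated = lambda X: min(len(X), 2)  # monotone and submodular
squared = lambda X: len(X) ** 2       # monotone but not submodular

truncated_checks = (is_submodular(truncated, ground), has_nonincreasing_marginals(truncated, ground))
squared_checks = (is_submodular(squared, ground), has_nonincreasing_marginals(squared, ground))
```

The two tests agree on both functions, as the equivalence predicts.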

It is clear that $\mathrm{subw}(H) \ge \mathrm{adw}(H)$: Definition 3.5 considers a larger set of functions. Furthermore, we show that
$\mathrm{subw}(H)$ is at most the fractional hypertree width of $H$. This is a straightforward consequence of the fact that an
edge-dominated submodular function is always bounded by the fractional cover number:

Lemma 3.6. Let $H$ be a hypergraph and $b$ be a monotone edge-dominated submodular function with $b(\emptyset) = 0$. Then
$b(S) \le \rho^*_H(S)$ for every $S \subseteq V(H)$.

Proof. The statement can be proved along the same lines as the proof of Shearer's Lemma [16] attributed to
Radhakrishnan. It is sufficient to prove the statement for the case $S = V(H)$: otherwise, we can consider the subhypergraph
of $H$ induced by $S$ and the function $b$ restricted to $S$. Let $\gamma : E(H) \to \mathbb{R}^+$ be a minimum fractional edge cover of $S$.
Let $v_1, \dots, v_n$ be an arbitrary ordering of $V(H)$ and let $V_i = \{v_1, \dots, v_i\}$, $V_0 = \emptyset$. For every $e \in E(H)$, we have

$$b(e) = \sum_{v_i \in e} \bigl(b(e \cap V_i) - b(e \cap V_{i-1})\bigr) \ge \sum_{v_i \in e} \bigl(b(V_i) - b(V_{i-1})\bigr)$$

(the equality is a simple telescopic sum; the inequality uses (1), i.e., the marginal value of $v_i$ with respect to $V_{i-1}$
is not greater than with respect to $e \cap V_{i-1}$). Therefore,

$$\rho^*_H(V(H)) = \sum_{e \in E(H)} \gamma(e) \ge \sum_{e \in E(H)} \gamma(e) b(e) \ge \sum_{e \in E(H)} \gamma(e) \sum_{v_i \in e} \bigl(b(V_i) - b(V_{i-1})\bigr)$$

$$= \sum_{i=1}^{n} \bigl(b(V_i) - b(V_{i-1})\bigr) \sum_{e \in E(H),\, v_i \in e} \gamma(e) \ge \sum_{i=1}^{n} \bigl(b(V_i) - b(V_{i-1})\bigr) = b(V(H))$$

(in the first inequality, we use that $b$ is edge dominated; in the last inequality, we use that $\gamma$ is a fractional edge
cover and that $b$ is monotone, so each difference $b(V_i) - b(V_{i-1})$ is nonnegative). □

Proposition 3.7. For every hypergraph $H$, $\mathrm{subw}(H) \le \mathrm{fhw}(H)$.

Proof. Let $(T, (B_t)_{t \in V(T)})$ be a tree decomposition of $H$ whose $\rho^*_H$-width is $\mathrm{fhw}(H)$. If $b$ is an edge-dominated monotone
submodular function, then by Lemma 3.6, $b(B_t) \le \rho^*_H(B_t) \le \mathrm{fhw}(H)$ for every bag $B_t$ of the decomposition, i.e.,
$b$-width$(H) \le \mathrm{fhw}(H)$. This is true for every such function $b$, hence $\mathrm{subw}(H) \le \mathrm{fhw}(H)$. □

Since $\mathrm{adw}(H) \le \mathrm{subw}(H) \le \mathrm{fhw}(H)$, if a class $\mathcal{H}$ of hypergraphs has bounded fractional hypertree width, then it
has bounded submodular width, and if a class $\mathcal{H}$ has bounded submodular width, then it has bounded adaptive width.
Surprisingly, it turns out that the latter implication is actually an equivalence: Corollary 6.10 shows that $\mathrm{subw}(H)$ is at
most $O(\mathrm{adw}(H)^4)$, thus a class of hypergraphs has bounded submodular width if and only if it has bounded adaptive
width. In other words, large submodular width can be certified already by modular functions: if submodular width is
unbounded in $\mathcal{H}$ and we want to choose an $H \in \mathcal{H}$ and a submodular function $b$ such that the $b$-width of $H$ is larger
than some constant $k$, then we can choose $H$ and $b$ such that $b$ is actually modular. There is no intuitive reason why
this is true: submodular functions seem to be much more powerful than modular functions. Still, we obtain this result
as a byproduct of characterizing submodular width.

There is no such connection between adaptive width and fractional hypertree width: it is shown in [44] that there
is a class of hypergraphs with bounded adaptive width and unbounded fractional hypertree width. Thus bounded
fractional hypertree width is a strictly stronger (more restrictive) property than bounded adaptive/submodular width.

Figure 1: Hypergraph properties that make CSP fixed-parameter tractable.



Figure 1 shows the relations of the hypergraph properties defined in this section (note that the elements of this Venn
diagram are sets of hypergraphs; e.g., the set "bounded treewidth" contains every set $\mathcal{H}$ of hypergraphs with bounded
treewidth). As discussed above, all the inclusions in the figure are proper.

Finally, let us remark that there have been investigations of tree decompositions and branch decompositions of
submodular functions and matroids in the literature [33, 47, 34, 32, 5]. However, in those results the submodular function
is a connectivity function: $b(S)$ describes the boundary of $S$, that is, the cost of separating $S$ from its complement. In
our case, $b(S)$ describes the cost of the separator $S$ itself. Therefore, we are in a completely different setting and the
previous results cannot be used at all.

4 From CSP instances to submodular functions 

In this section, we prove the main algorithmic result of the paper: CSP($\mathcal{H}$) is fixed-parameter tractable if $\mathcal{H}$ has bounded
submodular width.

Theorem 4.1. Let $\mathcal{H}$ be a class of hypergraphs such that $\mathrm{subw}(H) \le c_0$ for every $H \in \mathcal{H}$. Then CSP($\mathcal{H}$) can be solved
in time $2^{2^{O(|V|)}} \cdot \|I\|^{O(c_0)}$.

The proof of Theorem 4.1 is based on two main ideas:

1. A CSP instance $I$ can be decomposed into a bounded number of "uniform" CSP instances $I_1, \dots, I_t$ (Lemma 4.10).
Here uniform means that if $B \subseteq A$ are two sets of variables, then every solution of $\mathrm{pr}_B I_j$ has roughly the same
number of extensions to $\mathrm{pr}_A I_j$.

2. If $I$ is a uniform CSP instance, then (the logarithm of) the number of solutions on the different projections of $I$
can be described by an edge-dominated submodular function (Lemma 4.11). Therefore, if the hypergraph $H$ of
$I$ has bounded submodular width, then it follows that there is a tree decomposition where every bag has a small
number of solutions.

Recall from Section 2 that $\mathrm{pr}_A I$ is instance $I$ projected to a set $A$ of variables and $\mathrm{sol}_I(A)$ is the set of all solutions
of $\mathrm{pr}_A I$. In the implementation of the first idea (Lemma 4.10), we guarantee uniformity only to subsets of variables
that are "small" in the following hereditary sense (note that in general it is possible that $|\mathrm{sol}_I(S')| > |\mathrm{sol}_I(S)|$ for some
$S' \subseteq S$):

Definition 4.2. Let $I$ be a CSP instance and $M \ge 1$ an integer. We say that $S \subseteq V$ is $M$-small if $|\mathrm{sol}_I(S')| \le M$ for every
$S' \subseteq S$.

It is not difficult to find all the $M$-small sets, and every solution of the instances projected onto these sets:

Lemma 4.3. Let $I = (V,D,C)$ be a CSP instance and $M \ge 1$ an integer. There is an algorithm with running time
$2^{O(|V|)} \cdot \mathrm{poly}(\|I\|, M)$ that finds the set $\mathcal{S}$ of all $M$-small sets $S \subseteq V$ and constructs $\mathrm{sol}_I(S)$ for each such $S \in \mathcal{S}$.

Proof. For $i = 1, 2, \dots, |V|$, let us find every $M$-small set of size $i$. This is trivial to do for $i = 1$. Suppose that we have
already found the set $\mathcal{S}_i$ of all $M$-small sets of size exactly $i$. By definition, every size $i$ subset $S$ of an $M$-small set
$S'$ of size $i + 1$ is an $M$-small set. Thus we can find every $M$-small set of size $i + 1$ by enumerating every $S \in \mathcal{S}_i$ and
checking for every $v \in V \setminus S$ whether $S' := S \cup \{v\}$ is $M$-small. To check whether $S'$ is $M$-small, we first check whether
every subset of size $i$ is $M$-small, which is easy to do using the set $\mathcal{S}_i$. Then we construct $\mathrm{sol}_I(S')$: this can be done
by enumerating every tuple $s \in \mathrm{sol}_I(S)$ and every extension of $s$ by a new value from $D$. Thus we need to consider at
most $|\mathrm{sol}_I(S)| \cdot |D| \le M \cdot |D|$ tuples as possible members in $\mathrm{sol}_I(S')$, which means that $\mathrm{sol}_I(S')$ can be constructed in
time polynomial in $M$ and $\|I\|$. If $|\mathrm{sol}_I(S')| \le M$, then we put $S'$ into $\mathcal{S}_{i+1}$. As the size of each set $\mathcal{S}_i$ is at most $2^{|V|}$ and
every operation is polynomial in $M$ and $\|I\|$, the total running time is $2^{O(|V|)} \cdot \mathrm{poly}(\|I\|, M)$. □
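
The level-by-level procedure of this proof can be sketched as follows. This is a minimal illustration under assumptions: the instance representation (scopes as tuples, relations as sets of tuples), the function names, the example instance, and the reading of "$M$-small" as "at most $M$ solutions" are all choices made here, and no attempt is made to match the stated running time.

```python
def project(constraints, subset):
    """Constraints of the instance projected to the given set of variables."""
    out = []
    for scope, rel in constraints:
        keep = [i for i, v in enumerate(scope) if v in subset]
        if keep:
            out.append((tuple(scope[i] for i in keep),
                        {tuple(t[i] for i in keep) for t in rel}))
    return out

def satisfies(assignment, constraints):
    return all(tuple(assignment[v] for v in scope) in rel for scope, rel in constraints)

def find_small_sets(variables, domain, constraints, M):
    """Return {S: sol_I(S)} for every M-small set S, built one size at a time."""
    level = {}
    for v in variables:
        S = frozenset([v])
        sols = [{v: d} for d in domain if satisfies({v: d}, project(constraints, S))]
        if len(sols) <= M:
            level[S] = sols
    small = {}
    while level:
        small.update(level)
        next_level = {}
        for S, sols in level.items():
            for v in variables:
                S2 = S | {v}
                if v in S or S2 in next_level:
                    continue
                if any(S2 - {u} not in level for u in S2):
                    continue  # some subset of size i is not M-small
                proj = project(constraints, S2)
                # extend every solution on S by one new value and keep the consistent ones
                sols2 = [{**s, v: d} for s in sols for d in domain
                         if satisfies({**s, v: d}, proj)]
                if len(sols2) <= M:
                    next_level[S2] = sols2
        level = next_level
    return small

# Hypothetical instance over D = {0,1,2}: x = y and y != z.
V = ["x", "y", "z"]
D = [0, 1, 2]
EQ = {(d, d) for d in D}
NEQ = {(a, b) for a in D for b in D if a != b}
C = [(("x", "y"), EQ), (("y", "z"), NEQ)]

small3 = find_small_sets(V, D, C, 3)
small6 = find_small_sets(V, D, C, 6)
```

In this example the full variable set has only 6 solutions, but it is not 6-small, because its subset $\{x, z\}$ has 9 solutions in the projected instance; this is the hereditary aspect of Definition 4.2.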

The following definition gives the precise notion of uniformity that we use: 

Definition 4.4. Let $I = (V,D,C)$ be a CSP instance. For $B \subseteq A \subseteq V$ and an assignment $b : B \to D$, let $\mathrm{sol}_I(A|B = b) :=
\{a \in \mathrm{sol}_I(A) \mid \mathrm{pr}_B a = b\}$, the set of all extensions of $b$ to a solution of $\mathrm{pr}_A I$. Let $\max_I(A|B) = \max_{b \in \mathrm{sol}_I(B)} |\mathrm{sol}_I(A|B = b)|$.
We say that $A \subseteq V$ is $c$-uniform (for some integer $c$) if, for every $B \subseteq A$,

$$\max\nolimits_I(A|B) \le c\,|\mathrm{sol}_I(A)| / |\mathrm{sol}_I(B)|.$$

We define $\max_I(A|\emptyset) = |\mathrm{sol}_I(A)|$ and $\max_I(\emptyset|\emptyset) = 1$. We will drop $I$ from the subscript of max if it is clear from the
context. A CSP instance is $(N,c,\epsilon)$-uniform if every $N^c$-small set is $N^\epsilon$-uniform.

Let us prove two straightforward properties of the function max(A|B): 

Proposition 4.5. For every $B \subseteq A \subseteq V$ and $C \subseteq V$, we have

1. $\max(A|B) \ge |\mathrm{sol}(A)| / |\mathrm{sol}(B)|$,

2. $\max(A|B) \ge \max(A \cup C | B \cup C)$.

Proof. If every $b \in \mathrm{sol}(B)$ has at most $\max(A|B)$ extensions to $A$, then clearly $|\mathrm{sol}(A)|$ is at most $|\mathrm{sol}(B)| \cdot \max(A|B)$,
proving the first statement. To show the second statement, consider an $x \in \mathrm{sol}(B \cup C)$ with $\max(A \cup C | B \cup C)$ extensions
to $A \cup C$. For any two $y_1, y_2 \in \mathrm{sol}(A \cup C | B \cup C = x)$ with $y_1 \neq y_2$, we have $\mathrm{pr}_C y_1 = \mathrm{pr}_C y_2 = \mathrm{pr}_C x$, hence $y_1$ and $y_2$ can be
different only if $\mathrm{pr}_A y_1 \neq \mathrm{pr}_A y_2$. This means that $\mathrm{pr}_A y_1$ and $\mathrm{pr}_A y_2$ are two different extensions of $\mathrm{pr}_B x$ to $A$. Therefore,

$$\max(A|B) \ge |\mathrm{sol}(A | B = \mathrm{pr}_B x)| \ge |\mathrm{sol}(A \cup C | B \cup C = x)| = \max(A \cup C | B \cup C),$$

what we had to show. □
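
Both statements of Proposition 4.5 can be sanity-checked by brute force on a small instance. The sketch below is illustrative only: the representation, the helper names, and the example instance are assumptions, and the empty maximum is taken to be 0 when $\mathrm{sol}(B)$ is empty. Statement 1 is checked in the multiplied-out form $\max(A|B) \cdot |\mathrm{sol}(B)| \ge |\mathrm{sol}(A)|$ to avoid division.

```python
from itertools import combinations, product

def sol(variables, domain, constraints, subset):
    """sol(S): solutions of the instance projected to S, as a list of dicts."""
    sub_vars = [v for v in variables if v in subset]
    projected = []
    for scope, rel in constraints:
        keep = [i for i, v in enumerate(scope) if v in subset]
        if keep:
            projected.append((tuple(scope[i] for i in keep),
                              {tuple(t[i] for i in keep) for t in rel}))
    found = []
    for values in product(domain, repeat=len(sub_vars)):
        f = dict(zip(sub_vars, values))
        if all(tuple(f[v] for v in scope) in rel for scope, rel in projected):
            found.append(f)
    return found

def max_ext(variables, domain, constraints, A, B):
    """max(A|B): the largest number of extensions of some b in sol(B) to a member of sol(A)."""
    sol_A = sol(variables, domain, constraints, A)
    sol_B = sol(variables, domain, constraints, B)
    return max((sum(1 for a in sol_A if all(a[v] == b[v] for v in B)) for b in sol_B),
               default=0)

def all_subsets(items):
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Hypothetical instance over D = {0,1,2}: x = y and y != z.
V = ["x", "y", "z"]
D = [0, 1, 2]
C = [(("x", "y"), {(d, d) for d in D}),
     (("y", "z"), {(a, b) for a in D for b in D if a != b})]

pairs = [(A, B) for A in all_subsets(V) for B in all_subsets(V) if B <= A]
statement1 = all(max_ext(V, D, C, A, B) * len(sol(V, D, C, B)) >= len(sol(V, D, C, A))
                 for A, B in pairs)
statement2 = all(max_ext(V, D, C, A, B) >= max_ext(V, D, C, A | X, B | X)
                 for A, B in pairs for X in all_subsets(V))
```

For instance, in this example $\max(\{x,z\} | \{x\}) = 3$, while adding $C = \{y\}$ to both sides gives $\max(\{x,y,z\} | \{x,y\}) = 2$.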

Notice that (2) in Prop. 4.5 gives a hint that submodularity will be relevant: it is analogous to inequality (1)
expressing that marginal value is larger with respect to a smaller set.

We want to avoid dealing with assignments $b \in \mathrm{sol}(B)$ that cannot be extended to a member of $\mathrm{sol}(A)$ for some
$A \supseteq B$ (that is, $\mathrm{sol}(A|B = b) = \emptyset$). Of course, there is no easy way to avoid this in general (or even to detect if there is
such a $b$): for example, if $A$ is the set of all variables, then we would need to check if $b$ can be extended to a solution.
Therefore, we require that there is no such unextendable $b$ only if $A$ and $B$ are $M$-small:

Definition 4.6. A CSP instance is $M$-consistent if $\mathrm{sol}(B) = \mathrm{pr}_B \mathrm{sol}(A)$ for all $M$-small sets $B \subseteq A$.

The notion of $M$-consistency is very similar to $k$-consistency, a standard notion in the constraint satisfaction
literature [7, 11, 40]. However, we restrict the considered subsets not by the number of variables, but by the number
of solutions (more precisely, by considering only $M$-small sets). Similarly to usual $k$-consistency, we can achieve
$M$-consistency by throwing away partial solutions that violate the requirements: if we use the algorithm of Lemma 4.3
to find all possible assignments of the $M$-small sets, then we can check if there is such an unextendable $b$ for some
$M$-small sets $A$ and $B$. If there is such a $b$, then we can exclude it from consideration (without losing any solution of the
instance) by introducing a new constraint on $B$. By repeatedly excluding the unextendable assignments, we can avoid
all such problems. We say that $I' = (V,D,C')$ is a refinement of $I = (V,D,C)$ if for every constraint $\langle s, R \rangle \in C$, there is
a constraint $\langle s, R' \rangle \in C'$ such that $R' \subseteq R$.

Lemma 4.7. Let $I = (V,D,C)$ be a CSP instance and $M \ge 1$ an integer. There is an algorithm with running time
$2^{O(|V|)} \cdot \mathrm{poly}(\|I\|, M)$ that produces an $M$-consistent CSP instance $I'$ that is a refinement of $I$ with $\mathrm{sol}(I) = \mathrm{sol}(I')$.

Proof. Using the algorithm of Lemma 4.3, we can find all the $M$-small sets and then we can easily check if there
are two $M$-small sets $S \subseteq S'$ violating consistency, i.e., $\mathrm{sol}(S) \not\subseteq \mathrm{pr}_S \mathrm{sol}(S')$. In this case, let us add the constraint
$\langle S, \mathrm{pr}_S \mathrm{sol}(S') \rangle$; it is clear that $\mathrm{sol}(I)$ does not change but $|\mathrm{sol}(S)|$ strictly decreases. We repeat this step until the
instance becomes $M$-consistent. Note that adding the new constraint can make a set $M$-small that was not $M$-small
before, thus we need to rerun the algorithm of Lemma 4.3. To bound the number of iterations before $M$-consistency is
reached, observe that adding a new constraint does not increase $|\mathrm{sol}(A)|$ for any $A$ and strictly decreases $|\mathrm{sol}(S)|$ for
some $M$-small set $S$. As there are at most $2^{|V|}$ sets $S$ and $|\mathrm{sol}(S)| \le M$ for every $M$-small set $S$, it follows that this step
can be repeated at most $2^{|V|} \cdot M$ times. Thus the total time required to ensure that instance $I$ is $M$-consistent can be
bounded by $2^{O(|V|)} \cdot \mathrm{poly}(\|I\|, M)$. □

We want to avoid degenerate cases where there is no solution for trivial reasons. A CSP instance is nontrivial if
$\mathrm{sol}(\{v\}) \neq \emptyset$ for every $v \in V$.

Proposition 4.8. If $I$ is an $M$-consistent nontrivial CSP instance, then $\mathrm{sol}(S) \neq \emptyset$ for every $M$-small set $S$.

It is well known that by achieving $k$-consistency, we can solve CSP instances with treewidth $k$: the key observation
is that if an instance $I$ with treewidth at most $k$ has a $k$-consistent nontrivial refinement $I'$, then $I$ has a solution. The
following lemma adapts this statement to our setting.

Lemma 4.9. Let $I = (V,D,C)$ be a CSP instance and $M \ge 1$ an integer. Let $I'$ be an $M$-consistent nontrivial refinement
of $I$. If the hypergraph $H$ of $I$ has a tree decomposition where every bag $B$ is $M$-small in $I'$, then $I$ has a solution.

Proof. Suppose that there is such a tree decomposition $(T, (B_t)_{t \in V(T)})$. Assume that $T$ is rooted and for every node
$t \in V(T)$, let $V_t$ be the union of the bags that are descendants of $t$ (including $B_t$). We claim that every assignment in
$\mathrm{sol}_{I'}(B_t)$ can be extended to an assignment of $V_t$ that satisfies every constraint of $I$ whose scope is fully contained in $V_t$.
Applying this statement to the root of $T$ (whose bag has at least one solution in $I'$ by Proposition 4.8) proves that
there exists a solution for $I$.

We prove the claim for every node of $T$ in a bottom up order. The statement is trivial for the leaves. Let $t_1, \dots, t_\ell$
be the children of $t$ and suppose the claim is true for these nodes. Consider an assignment $g \in \mathrm{sol}_{I'}(B_t)$. Since $I'$ is
$M$-consistent and $B_{t_i}$ is $M$-small, assignment $g|_{B_t \cap B_{t_i}}$ can be extended to an assignment $g_i \in \mathrm{sol}_{I'}(B_{t_i})$. As the claim
is true for node $t_i$, assignment $g_i$ can be extended to an assignment $g'_i$ of $V_{t_i}$. The assignments $g, g'_1, \dots, g'_\ell$ can be
combined to obtain an assignment $g'$ on $V_t$ (note that this is well defined: the intersection of $V_{t_i}$ and $V_{t_j}$ is in $B_t$, which
means that a variable appearing in both $V_{t_i}$ and $V_{t_j}$ has the same value in $g$, $g'_i$, and $g'_j$). Furthermore, every edge $e$ of $H$
that is fully contained in $V_t$ is fully contained in at least one of $B_t, V_{t_1}, \dots, V_{t_\ell}$, and the corresponding assignment $g, g'_1,
\dots, g'_\ell$ shows that $g'$ satisfies the constraint corresponding to $e$. □

Note the subtle detail that Lemma 4.9 does not claim that $I'$ has a solution. Furthermore, when Lemma 4.7 creates
an $M$-consistent instance, then it possibly adds many new constraints and the hypergraph of $I'$ can be very dense even
if the hypergraph of $I$ has nice structure. However, this is not a problem, as Lemma 4.9 does not require any property
on the hypergraph of $I'$.

4.1 Decomposition into uniform CSP instances 

Our algorithm for decomposing a CSP instance into uniform CSP instances is inspired by a combinatorial result of
Alon et al. [4], which shows that, for every fixed $n$, an $n$-dimensional point set $S$ can be partitioned into $\mathrm{polylog}(|S|)$
classes such that each class is $O(1)$-uniform. We follow the same proof idea: the instance is split into two instances if
uniformity is violated somewhere, and we analyze the change of an appropriately defined weight function to bound the
number of splits performed. However, the parameter setting is different in our proof: we want to partition into a bounded
number of classes, but we are satisfied with somewhat weaker uniformity. Another minor technical difference is that we require
uniformity only on the $N^c$-small sets.

Lemma 4.10. Let $I = (V,D,C)$ be a CSP instance, let $N$ be an integer, and let $c \ge 1$, $\epsilon > 0$ be real numbers. There is an
algorithm with running time $2^{2^{O(|V|)} \cdot c/\epsilon} \cdot \mathrm{poly}(\|I\|, N^c)$ that produces a set of $(N,c,\epsilon)$-uniform $N^c$-consistent nontrivial
instances $I_1, \dots, I_t$ with $0 \le t \le 2^{2^{O(|V|)} \cdot c/\epsilon}$, all on the set $V$ of variables, such that

1. every solution of $I$ is a solution of exactly one instance $I_i$,

2. for every $1 \le i \le t$, instance $I_i$ is a refinement of $I$.

Proof. The main step of the algorithm takes a CSP instance $I$ and either makes it $(N,c,\epsilon)$-uniform and $N^c$-consistent,
or splits it into two instances $I_{\mathrm{small}}$, $I_{\mathrm{large}}$. By applying the main step recursively on $I_{\mathrm{small}}$ and $I_{\mathrm{large}}$, we eventually
arrive at a set of $(N,c,\epsilon)$-uniform $N^c$-consistent instances. We will argue that the number of constructed instances is
$2^{2^{O(|V|)} \cdot c/\epsilon}$.

In the main step, we first check if the instance is trivial; in this case we can stop with $t = 0$. Otherwise, we
invoke the algorithm of Lemma 4.7 to obtain an $N^c$-consistent refinement of the instance, without losing any
solution. Next we check if this $N^c$-consistent instance $I$ is $(N,c,\epsilon)$-uniform. This can be tested in time $2^{O(|V|)} \cdot
\mathrm{poly}(\|I\|, N^c)$ if we use Lemma 4.3 to find all the $N^c$-small sets and the corresponding sets of solutions. Suppose
that $N^c$-small sets $B \subseteq A$ violate uniformity, that is, $\max(A|B) > N^\epsilon |\mathrm{sol}(A)| / |\mathrm{sol}(B)|$. Let $\mathrm{sol}_{\mathrm{small}}(B)$ contain
those tuples $b$ for which $|\mathrm{sol}(A|B = b)| \le \sqrt{N^\epsilon}\, |\mathrm{sol}(A)| / |\mathrm{sol}(B)|$ and let $\mathrm{sol}_{\mathrm{large}}(B) = \mathrm{sol}(B) \setminus \mathrm{sol}_{\mathrm{small}}(B)$. Note that
$|\mathrm{sol}(A)| \ge |\mathrm{sol}_{\mathrm{large}}(B)| \cdot (\sqrt{N^\epsilon}\, |\mathrm{sol}(A)| / |\mathrm{sol}(B)|)$ (as every tuple $b \in \mathrm{sol}_{\mathrm{large}}(B)$ has at least $\sqrt{N^\epsilon}\, |\mathrm{sol}(A)| / |\mathrm{sol}(B)|$
extensions to $A$), hence $|\mathrm{sol}_{\mathrm{large}}(B)| \le |\mathrm{sol}(B)| / \sqrt{N^\epsilon}$. Let instance $I_{\mathrm{small}}$ (resp., $I_{\mathrm{large}}$) be obtained from $I$ by adding the
constraint $\langle B, \mathrm{sol}_{\mathrm{small}}(B) \rangle$ (resp., $\langle B, \mathrm{sol}_{\mathrm{large}}(B) \rangle$). Clearly, the set of solutions of $I$ is the disjoint union of the sets of
solutions of $I_{\mathrm{small}}$ and $I_{\mathrm{large}}$. This completes the description of the main step.

It is clear that if the recursive procedure stops, then the instances at the leaves of the recursion satisfy the two
requirements. We show that the height of the recursion tree can be bounded from above by a function $h(|V|, c, \epsilon)$
depending only on $|V|$, $c$, and $\epsilon$; in particular, this shows that the recursive algorithm eventually stops and produces at
most $2^{h(|V|, c, \epsilon)}$ instances.

Let us consider a path in the recursion tree starting at the root, and let $I^1, I^2, \dots, I^p$ be the corresponding
$N^c$-consistent instances. If a set $S$ is $N^c$-small in $I^j$, then it is $N^c$-small in $I^{j'}$ for every $j' > j$: the main step cannot increase
$|\mathrm{sol}(S)|$ for any $S$. Thus, with the exception of at most $2^{|V|}$ values of $j$, instances $I^j$ and $I^{j+1}$ have the same $N^c$-small
sets. Let us consider a subpath $I^x, \dots, I^y$ such that all these instances have the same $N^c$-small sets. We show that the
length of this subpath is at most $O(3^{|V|} \cdot c/\epsilon)$, hence $p = O(2^{|V|} \cdot 3^{|V|} \cdot c/\epsilon)$. As this holds for any path starting at the
root, we obtain that the height of the recursion tree is $2^{O(|V|)} \cdot c/\epsilon$ and hence $t = 2^{2^{O(|V|)} \cdot c/\epsilon}$.

For the instance $I^j$, let us define the following weight:

$$W^j = \sum_{\substack{\emptyset \subseteq B \subseteq A \subseteq V \\ A, B \text{ are } N^c\text{-small in } I^j}} \log \max\nolimits_{I^j}(A|B).$$

We bound the length of the subpath $I^x, \dots, I^y$ by analyzing how this weight changes in each step. Observe first that when
invoking the algorithm of Lemma 4.7 to find an $N^c$-consistent refinement, the weight does not increase: adding
new constraints cannot increase $\max(A|B)$ for any $A, B \subseteq V$ and cannot create new $N^c$-small sets by the assumption
on $x$ and $y$. Thus it is sufficient to analyze how the weight decreases in $I_{\mathrm{large}}$ and $I_{\mathrm{small}}$ compared to $I$. Note that
$0 \le W^j \le 3^{|V|} \log N^c = 3^{|V|} \cdot c \log N$: the sum consists of at most $3^{|V|}$ terms and (as $A$ is $N^c$-small and the instance
$I^j$ is $N^c$-consistent and nontrivial) $\max_{I^j}(A|B)$ is between 1 and $N^c$. We show that $W^{j+1} \le W^j - (\epsilon/2) \log N$, which
immediately implies that the length of the subpath is $O(3^{|V|} \cdot c/\epsilon)$. Let us inspect how $W^{j+1}$ changes compared to $W^j$.
Since $I^j$ and $I^{j+1}$ have the same $N^c$-small sets, no new term can appear in $W^{j+1}$. It is clear that $\max_{I^{j+1}}(A|B)$ cannot be
greater than $\max_{I^j}(A|B)$ for any $A, B$. However, there is at least one term that strictly decreases. Suppose first that $I^{j+1}$
was obtained from $I^j$ by adding the constraint $\langle B, \mathrm{sol}_{\mathrm{small}}(B) \rangle$. Then

$$\log \max\nolimits_{I^{j+1}}(A|B) \le \log\bigl(\sqrt{N^\epsilon}\, |\mathrm{sol}_{I^j}(A)| / |\mathrm{sol}_{I^j}(B)|\bigr) < \log\bigl(\max\nolimits_{I^j}(A|B) / \sqrt{N^\epsilon}\bigr) = \log \max\nolimits_{I^j}(A|B) - (\epsilon/2) \log N.$$

On the other hand, if $I^{j+1}$ was obtained by adding the constraint $\langle B, \mathrm{sol}_{\mathrm{large}}(B) \rangle$, then

$$\log \max\nolimits_{I^{j+1}}(B|\emptyset) = \log |\mathrm{sol}_{I^{j+1}}(B)| \le \log\bigl(|\mathrm{sol}_{I^j}(B)| / \sqrt{N^\epsilon}\bigr) = \log \max\nolimits_{I^j}(B|\emptyset) - (\epsilon/2) \log N.$$

In both cases, we get that at least one term decreases by at least $(\epsilon/2) \log N$. □

4.2 Uniform CSP instances and submodularity 

Assume for a moment that we have a 1-uniform instance $I$ with hypergraph $H$. Note that by Prop. 4.5(1), this means that $\max(A|B) = |\mathrm{sol}(A)|/|\mathrm{sol}(B)|$. Suppose that every constraint contains at most $N$ tuples and let us define the function $b(S) = \log_N |\mathrm{sol}(S)|$. For every edge $e \in E(H)$, there is a corresponding constraint, which has at most $N$ tuples by the definition of $N$. Thus $|\mathrm{sol}(e)| \le N$ and hence $b(e) \le 1$ for every $e \in E(H)$, that is, $b$ is edge dominated. The crucial observation of this section is that this function $b$ is submodular:

\begin{align*}
b(X) + b(Y) &= \log_N |\mathrm{sol}(X)| + \log_N |\mathrm{sol}(Y)| = \log_N |\mathrm{sol}(X)| + \log_N\bigl(|\mathrm{sol}(X \cap Y)| \cdot \max(Y \,|\, X \cap Y)\bigr) \\
&\ge \log_N |\mathrm{sol}(X)| + \log_N\bigl(|\mathrm{sol}(X \cap Y)| \cdot \max(X \cup Y \,|\, X)\bigr) \\
&= \log_N\Bigl(|\mathrm{sol}(X)| \cdot \frac{|\mathrm{sol}(X \cup Y)|}{|\mathrm{sol}(X)|}\Bigr) + \log_N |\mathrm{sol}(X \cap Y)| = b(X \cap Y) + b(X \cup Y)
\end{align*}

(the equalities follow from 1-uniformity; the inequality uses Prop. 4.5(2) with $A = Y$, $B = X \cap Y$, $C = X$). Therefore, if the submodular width of $H$ is at most $c$, then $H$ has a tree decomposition where $b(B) \le c$ and hence $|\mathrm{sol}(B)| \le N^c$ for every bag $B$. Thus we can find a solution of the instance by dynamic programming in time polynomial in $N^c$.

Lemma 4.10 guarantees some uniformity for the created instances, but not perfect 1-uniformity and only for the $N^c$-small sets. Thus in Lemma 4.11 we need to define $b$ in a slightly different way: we add some small terms to correct errors arising from the weaker uniformity and we truncate the function at large values (i.e., for sets that are not $N^c$-small).

Lemma 4.11. Let $I = (V,D,C)$ be a CSP instance with hypergraph $H$ such that $|\mathrm{sol}(e)| \le N$ for every $e \in E(H)$. If $I$ is $N^c$-consistent and $(N,c,\epsilon^3)$-uniform for some $c \ge 1$ and $\epsilon := 1/|V|$, then the following function $b$ is an edge-dominated, monotone, submodular function on $V(H)$ with $b(\emptyset) = 0$:

\[ b(S) := \begin{cases} (1-\epsilon)\log_N |\mathrm{sol}(S)| + 2\epsilon^2|S| - \epsilon^3|S|^2 & \text{if } S \text{ is } N^c\text{-small}, \\ (1-\epsilon)c + 2\epsilon^2|S| - \epsilon^3|S|^2 & \text{otherwise.} \end{cases} \]



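Proof. Let $h(S) := 2\epsilon^2|S| - \epsilon^3|S|^2$. It is easy to see that $h(S)$ is monotone and $0 \le h(S) \le \epsilon$ for every $S \subseteq V(H)$ (as $\epsilon|S| \le 1$). Furthermore, $h$ is a submodular function:

\begin{align*}
h(X) + h(Y) - h(X \cap Y) - h(X \cup Y) &= 2\epsilon^2\bigl(|X| + |Y| - |X \cap Y| - |X \cup Y|\bigr) + \epsilon^3\bigl(-|X|^2 - |Y|^2 + |X \cap Y|^2 + |X \cup Y|^2\bigr) \\
&= \epsilon^3\bigl(-(|X \cap Y| + |X \setminus Y|)^2 - (|X \cap Y| + |Y \setminus X|)^2 + |X \cap Y|^2 + (|X \cap Y| + |X \setminus Y| + |Y \setminus X|)^2\bigr) \\
&= 2\epsilon^3 |X \setminus Y| \cdot |Y \setminus X| \ge 0.
\end{align*}

This calculation shows that if $|X \setminus Y|, |Y \setminus X| \ge 1$, then we actually have $h(X) + h(Y) \ge h(X \cap Y) + h(X \cup Y) + 2\epsilon^3$. We will use this extra $2\epsilon^3$ term to dominate the error terms arising from assuming only $(N,c,\epsilon^3)$-uniformity instead of perfect uniformity.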
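The identity for $h$ above can be checked mechanically on small ground sets. The following is a minimal sketch, not part of the original argument: the helper names `h`, `subsets`, and `check_h` are hypothetical, and exact rational arithmetic is used so that equality tests are meaningful.

```python
from fractions import Fraction
from itertools import combinations


def h(size, eps):
    """h(S) = 2*eps^2*|S| - eps^3*|S|^2; it depends only on |S|."""
    return 2 * eps**2 * size - eps**3 * size**2


def subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def check_h(n):
    """Check 0 <= h <= eps and the submodularity gap formula for |V| = n."""
    eps = Fraction(1, n)  # eps := 1/|V|, as in Lemma 4.11
    all_sets = list(subsets(range(n)))
    for X in all_sets:
        if not (0 <= h(len(X), eps) <= eps):
            return False
        for Y in all_sets:
            gap = (h(len(X), eps) + h(len(Y), eps)
                   - h(len(X & Y), eps) - h(len(X | Y), eps))
            # The gap should equal 2 * eps^3 * |X \ Y| * |Y \ X|.
            if gap != 2 * eps**3 * len(X - Y) * len(Y - X):
                return False
    return True
```

Calling `check_h(4)` enumerates all pairs of subsets of a 4-element set; the check is exhaustive only for the chosen size and is an illustration, not a proof.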

Let us first verify the monotonicity of $b$. If $Y$ is $N^c$-small, then every $X \subseteq Y$ is $N^c$-small, which implies $|\mathrm{sol}(X)| \le |\mathrm{sol}(Y)|$ as $I$ is $N^c$-consistent. Therefore, $b(X) \le b(Y)$ follows from the monotonicity of $h$. If $Y$ is not $N^c$-small, then $b(Y) = (1-\epsilon)c + h(Y)$ and $b(X) \le b(Y)$ is clear for every $X \subseteq Y$, no matter whether $X$ is $N^c$-small or not.

To see that $b$ is edge-dominated, consider an edge $e \in E(H)$. By assumption, $\log_N |\mathrm{sol}(e)| \le 1$ for every $e \in E(H)$ and hence (using $N^c$-consistency and $c \ge 1$) $e$ is $N^c$-small. Thus $b(e) \le (1-\epsilon) + h(e) \le 1$, as required.

Finally, let us verify the submodularity of $b$ for some $X, Y \subseteq V$. If $X \subseteq Y$ or $Y \subseteq X$, then there is nothing to show. Thus we can assume that $|X \setminus Y|, |Y \setminus X| \ge 1$. We consider 3 cases depending on which of $X$ and $Y$ are $N^c$-small.

Suppose first that $X$ and $Y$ are both $N^c$-small. In this case,



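\begin{align*}
b(X) + b(Y) &= (1-\epsilon)\log_N |\mathrm{sol}(X)| + (1-\epsilon)\log_N |\mathrm{sol}(Y)| + h(X) + h(Y) \\
&= (1-\epsilon)\log_N |\mathrm{sol}(X)| + (1-\epsilon)\log_N\Bigl(|\mathrm{sol}(X \cap Y)| \cdot \frac{|\mathrm{sol}(Y)|}{|\mathrm{sol}(X \cap Y)|}\Bigr) + h(X) + h(Y) \\
&\ge (1-\epsilon)\log_N |\mathrm{sol}(X)| + (1-\epsilon)\log_N |\mathrm{sol}(X \cap Y)| + (1-\epsilon)\log_N\bigl(\max(Y \,|\, X \cap Y)/N^{\epsilon^3}\bigr) + h(X) + h(Y) \\
&\ge (1-\epsilon)\log_N |\mathrm{sol}(X \cap Y)| + (1-\epsilon)\log_N\bigl(|\mathrm{sol}(X)| \max(X \cup Y \,|\, X)\bigr) - (1-\epsilon)\epsilon^3 + h(X \cap Y) + h(X \cup Y) + 2\epsilon^3 \\
&\ge (1-\epsilon)\log_N |\mathrm{sol}(X \cap Y)| + (1-\epsilon)\log_N |\mathrm{sol}(X \cup Y)| + h(X \cap Y) + h(X \cup Y) \ge b(X \cap Y) + b(X \cup Y)
\end{align*}

(in the first inequality, we used the definition of $(N,c,\epsilon^3)$-uniformity on $X \cap Y$ and $Y$; in the second inequality, we used the submodularity of $h$ and Prop. 4.5(2) for $A = Y$, $B = X \cap Y$, and $C = X$; in the third inequality, we used Prop. 4.5(1) for $A = X \cup Y$, $B = X$; the last inequality is strict only if $X \cup Y$ is not $N^c$-small).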

For the second case, suppose that, say, $X$ is $N^c$-small but $Y$ is not. In this case, $X \cap Y$ is $N^c$-small but $X \cup Y$ is not. Thus

\begin{align*}
b(X) + b(Y) &= (1-\epsilon)\log_N |\mathrm{sol}(X)| + (1-\epsilon)c + h(X) + h(Y) \\
&\ge (1-\epsilon)\log_N |\mathrm{sol}(X \cap Y)| + (1-\epsilon)c + h(X \cap Y) + h(X \cup Y) = b(X \cap Y) + b(X \cup Y)
\end{align*}

(in the inequality, we used the $N^c$-consistency on $X \cap Y$ and $X$, and the submodularity of $h$).

Finally, suppose that neither $X$ nor $Y$ is $N^c$-small. In this case, $X \cup Y$ is not $N^c$-small either. Now

\[ b(X) + b(Y) = 2(1-\epsilon)c + h(X) + h(Y) \ge 2(1-\epsilon)c + h(X \cap Y) + h(X \cup Y) \ge b(X \cap Y) + b(X \cup Y). \]

□ 

Having constructed the submodular function $b$ as in Lemma 4.11, we can use the argument described at the beginning of the section: if $H$ has submodular width at most $(1-\epsilon)c$, then there is a tree decomposition where every bag is $N^c$-small, and we can use this tree decomposition to find a solution. In fact, by Lemma 4.9, in this case $N^c$-consistency implies that every nontrivial instance has a solution.

Proof (of Theorem 4.1). Let $I$ be an instance of CSP($\mathcal{H}$) having hypergraph $H \in \mathcal{H}$. We decide the solvability of $I$ the following way. Let $N \le \|I\|$ be the size of the largest constraint relation in $I$, i.e., every constraint has at most $N$ satisfying assignments. Set $\epsilon := 1/|V(H)|$, and let $c := c_0/(1-\epsilon)$. Let us use the algorithm of Lemma 4.10 to produce the nontrivial $N^c$-consistent $(N,c,\epsilon^3)$-uniform instances $I_1, \dots, I_t$. The running time of this step is $2^{2^{O(|V|)} \cdot c/\epsilon} \cdot \mathrm{poly}(\|I\|, N^c)$, which is $2^{2^{O(|V(H)|)}} \cdot \|I\|^{O(c_0)}$.

If $t = 0$, then we can conclude that $I$ has no solution. Otherwise, we argue that $I$ has a solution. Consider any $I_i$, and let $b$ be the edge-dominated monotone submodular function defined in Lemma 4.11. By definition of submodular width, $H$ has a tree decomposition $(T, (B_t)_{t \in V(T)})$ such that $b(B_t) \le \mathrm{subw}(H) \le c_0 = (1-\epsilon)c$ for every $t \in V(T)$. Since $b(S) \le (1-\epsilon)c$ implies $|\mathrm{sol}(S)| \le N^c$ and $b$ is monotone, this means that $B_t$ is $N^c$-small in $I_i$ for every $t \in V(T)$. Therefore, the conditions of Lemma 4.9 hold, and $I$ has a solution. □



5 From submodular functions to highly connected sets 

The aim of this section is to show that if a hypergraph $H$ has large submodular width, then there is a large highly connected set in $H$. Recall that we say that a set $W$ is $(\mu,\lambda)$-connected for some fractional independent set $\mu$ and $\lambda > 0$, if for every disjoint $A, B \subseteq W$, every fractional $(A,B)$-separator has weight at least $\lambda \cdot \min\{\mu(A), \mu(B)\}$ (as defined earlier). Equivalently, we can say that for every disjoint $A, B \subseteq W$, there is an $(A,B)$-flow of value $\lambda \cdot \min\{\mu(A), \mu(B)\}$. The main result of this section allows us to identify a highly connected set if submodular width is large:

Theorem 5.1. For every sufficiently small constant $\lambda > 0$, the following holds. Let $b$ be an edge-dominated monotone submodular function of $H$ with $b(\emptyset) = 0$. If the $b$-width of $H$ is greater than $\frac{3}{2}(w+1)$, then $\mathrm{con}_\lambda(H) \ge w$.



For the proof of Theorem 5.1, we need to show that if there is no tree decomposition where $b(B)$ is small for every
bag $B$, then a highly connected set exists. There is a standard recursive procedure that either builds a tree decomposition
or finds a highly connected set (see e.g., [21, Section 11.2]). Simplifying somewhat, the main idea is that if the graph
can be decomposed into smaller graphs by splitting a certain set of vertices into two parts, then a tree decomposition 
for each part is constructed using the algorithm recursively, and the tree decompositions for the parts are joined in an 
appropriate way to obtain a tree decomposition for the original graph. On the other hand, if the set of vertices cannot 
be split, then we can conclude that it is highly connected. This high-level idea has been applied for various notions of 
tree decompositions B8ll46l l2l l47l . and it turns out to be useful in our context as well. However, we need to overcome 
two major difficulties: 

1. A highly connected set in our context is defined as not having certain fractional separators (i.e., weight assignments). However, if we want to build a tree decomposition in a recursive manner, we need integer separators (i.e., subsets of vertices) that decompose the hypergraph into smaller parts.

2. Measuring the sizes of sets with a submodular function $b$ can lead to problems, since the size of the union of two sets can be much smaller than the sum of the sizes of the two sets. We need the property that, roughly speaking, removing a "large" part from a set makes it "much smaller." For example, if $A$ and $B$ are components of $H \setminus S$, and both $b(A)$ and $b(B)$ are large, then we need the property that both of them are much smaller than $b(A \cup B)$. Adler [1, Section 4.2] investigates the relation between some notion of highly connected sets and $f$-width, but assumes that $f$ is additive: if $A$ and $B$ do not touch, then $f(A \cup B) = f(A) + f(B)$. However, for a submodular function $b$, there is no reason to assume that additivity holds: for example, it very well may be that $b(A) = b(B) = b(A \cup B)$.

To overcome the first difficulty, we have to understand what fractional separation really means. A natural question is whether fractional separation is equivalent to some notion of integral separation, perhaps up to constant factors. The first, naive, version of this question is whether a fractional $(X,Y)$-separator of weight $w$ implies that there are $O(w)$ edges whose union is an $(X,Y)$-separator, i.e., there is an $(X,Y)$-separator $S$ with $\rho_H(S) = O(w)$. There is a simple counterexample showing that this is not true. It is well-known that for every integer $k > 0$ there is a hypergraph $H$ such that $\rho^*(H) = 2$ and $\rho(H) = k$. Let $V$ be the set of vertices of $H$ and let $H'$ be obtained from $H$ by extending it with two independent sets $X$, $Y$, each of size $k$, and connecting every vertex of $X \cup Y$ with every vertex of $V$. It is clear that there is a fractional $(X,Y)$-separator of weight 2, but every $(X,Y)$-separator $S$ has to fully contain at least one of $X$, $Y$, or $V$, implying $\rho_{H'}(S) \ge k$.

A less naive question is whether a fractional $(X,Y)$-separator with weight $w$ in $H$ implies that there exists an $(X,Y)$-separator $S$ with $\rho^*_H(S) = O(w)$ (or at most $f(w)$ for some function $f$). It can be shown that this is not true either: using
the hypergraph family presented in B4l Section 5], one can construct counterexamples where the minimum weight of 
a fractional $(X,Y)$-separator is a constant, but $\rho^*_H(S)$ has to be arbitrarily large for every $(X,Y)$-separator $S$ (we omit
the details).

We will characterize fractional separation in a very different way. We show that if there is a fractional $(A,B)$-separator of weight $w$, then there is an $(A,B)$-separator $S$ with $b(S) = O(w)$ for every edge-dominated monotone submodular function $b$. Note that this separator $S$ can be different for different functions $b$, so we are not claiming that there is a single $(A,B)$-separator $S$ that is small in every $b$. The converse is also true, thus this gives a novel characterization of fractional separation, tight up to a constant factor. This result is the key idea that allows us to move from the domain of submodular functions to the domain of pure hypergraph properties: if there is no $(A,B)$-separator such that $b(S)$ is small, then we know that there is no small fractional $(A,B)$-separator, which is a property of the hypergraph $H$ only and no longer has anything to do with the submodular function $b$.

To overcome the second difficulty, we introduce a transformation that turns a monotone submodular function $b$ on $V(H)$ into a function $b^*$ that also encodes the neighborhood structure of $H$. The new function $b^*$ is no longer monotone and submodular, but it has a number of remarkable properties: for example, $b^*$ remains edge dominated and $b^*(S) \ge b(S)$ for every set $S \subseteq V(H)$, implying that $b^*$-width is not smaller than $b$-width. The main idea is to prove Theorem 5.1 for $b^*$-width instead of $b$-width. Because of the way $b^*$ encodes the neighborhoods, the second difficulty will disappear: for example, it will be true that $b^*(A \cup B) = b^*(A) + b^*(B)$ if there are no edges between $A$ and $B$, that is, $b^*$ is additive on disjoint components. Lemma 5.6 formulates (in a somewhat technical way) the exact property of $b^*$ that we will need. Furthermore, luckily it turns out that the result mentioned in the previous paragraph remains true with $b$ replaced by $b^*$: if there is a fractional $(A,B)$-separator of weight $w$, then there is an $(A,B)$-separator $S$ such that not only $b(S)$, but even $b^*(S)$ is $O(w)$.

5.1 The function b* 

We define the function $b^*$ the following way. Let $H$ be a hypergraph and let $b$ be a monotone submodular function defined on $V(H)$. Let $S_{V(H)}$ be the set of all permutations of $V(H)$. For a permutation $\pi \in S_{V(H)}$, let $N^-_\pi(v)$ be the neighbors of $v$ preceding $v$ in the ordering $\pi$. For $\pi \in S_{V(H)}$ and $Z \subseteq V(H)$, we define

\[ \partial b_{\pi,Z}(v) := b\bigl(v \cup (N^-_\pi(v) \cap Z)\bigr) - b\bigl(N^-_\pi(v) \cap Z\bigr). \]

In other words, $\partial b_{\pi,Z}(v)$ is the marginal value of $v$ with respect to the set of its neighbors in $Z$ preceding it. We abbreviate $\partial b_{\pi,V(H)}$ by $\partial b_\pi$. As usual, we extend the definition to subsets by letting $\partial b_{\pi,Z}(S) := \sum_{v \in S} \partial b_{\pi,Z}(v)$. Furthermore, we define

\[ b_\pi(Z) := \partial b_{\pi,Z}(Z) = \sum_{v \in Z} \partial b_{\pi,Z}(v), \]

\[ b^*(Z) := \min_{\pi \in S_{V(H)}} b_\pi(Z). \]

Thus $b_\pi(Z)$ is the sum of the marginal values with respect to a given ordering, while $b^*(Z)$ is the smallest possible sum taken over all possible orderings. Let us prove some simple properties of the function $b^*$. Properties (1)-(3) and their proofs show why $b^*$ was defined this way; the other properties are only technical statements that we will need later.
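To make the definition concrete, here is a minimal brute-force sketch of $b_\pi$ and $b^*$ for tiny hypergraphs. It is an illustration only: the function names (`primal_neighbors`, `b_pi`, `b_star`), the toy hypergraph (a path on three vertices), and the choice $b(S) = \min(1, |S|)$ are assumptions made for this example, and enumerating all orderings is exponential.

```python
from itertools import permutations


def primal_neighbors(edges):
    """Map each vertex to the set of vertices sharing a hyperedge with it."""
    adj = {}
    for e in edges:
        for v in e:
            adj.setdefault(v, set()).update(u for u in e if u != v)
    return adj


def b_pi(b, adj, order, Z):
    """Sum over v in Z of the marginal value of v with respect to
    its neighbours in Z that precede it in the given ordering."""
    pos = {v: i for i, v in enumerate(order)}
    total = 0
    for v in Z:
        earlier = frozenset(u for u in adj.get(v, ())
                            if u in Z and pos[u] < pos[v])
        total += b(earlier | {v}) - b(earlier)
    return total


def b_star(b, adj, vertices, Z):
    """Minimum of b_pi(Z) over all orderings of the vertex set."""
    return min(b_pi(b, adj, order, Z) for order in permutations(vertices))


# Toy data (assumed): the path p - q - r with edges {p,q} and {q,r}.
toy_edges = [{"p", "q"}, {"q", "r"}]
toy_vertices = ["p", "q", "r"]
toy_adj = primal_neighbors(toy_edges)


def toy_b(S):
    """An edge-dominated monotone submodular function: min(1, |S|)."""
    return min(1, len(S))
```

On this toy instance, `b_star(toy_b, toy_adj, toy_vertices, {"p", "r"})` evaluates to 2 although `toy_b({"p", "r"})` is 1, because the two vertices are not adjacent and each contributes its full marginal value. The value for the whole vertex set is 1, so the example also shows that $b^*$ need not be monotone.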

Proposition 5.2. Let $H$ be a hypergraph and let $b$ be a monotone submodular function defined on $V(H)$ with $b(\emptyset) = 0$. For every $\pi \in S_{V(H)}$ and $Z \subseteq V(H)$ we have

1. $b_\pi(Z) \ge b(Z)$,

2. $b^*(Z) \ge b(Z)$,

3. $b_\pi(Z) = b(Z)$ if $Z$ is a clique,

4. $\partial b_{\pi,Z_1}(v) \le \partial b_{\pi,Z_2}(v)$ if $Z_2 \subseteq Z_1$,

5. $\partial b_\pi(v) \le \partial b_{\pi,Z}(v)$,

6. $b^*(X \cup Y) \le b^*(X) + b^*(Y)$.

Proof. (1) We prove the statement by induction on $|Z|$; for $Z = \emptyset$, the claim is true (as $b(\emptyset) = 0$). Otherwise, let $v$ be the last element of $Z$ according to the ordering $\pi$. As $v$ is not preceding any element of $Z$, for every $u \in Z$ we have $N^-_\pi(u) \cap Z = N^-_\pi(u) \cap (Z \setminus v)$, and hence $\partial b_{\pi,Z}(u) = \partial b_{\pi,Z \setminus v}(u)$.

\begin{align*}
b_\pi(Z) &= \sum_{u \in Z \setminus v} \partial b_{\pi,Z}(u) + \partial b_{\pi,Z}(v) = \sum_{u \in Z \setminus v} \partial b_{\pi,Z \setminus v}(u) + \partial b_{\pi,Z}(v) \\
&= b_\pi(Z \setminus v) + \partial b_{\pi,Z}(v) \ge b(Z \setminus v) + b\bigl(v \cup (N^-_\pi(v) \cap Z)\bigr) - b\bigl(N^-_\pi(v) \cap Z\bigr) \ge b(Z).
\end{align*}

In the first inequality, we used the induction hypothesis and the definition of $\partial b_{\pi,Z}(v)$; in the second inequality, we used the submodularity of $b$: the marginal value of $v$ with respect to $Z \setminus v$ is not greater than with respect to $N^-_\pi(v) \cap Z$.

(2) Follows immediately from (1) and from the definition of $b^*$.

(3) We prove the statement by induction on $|Z|$. As in (1), let $v$ be the last vertex of $Z$ in $\pi$. Note that since $Z$ is a clique, $N^-_\pi(v) \cap Z$ is exactly $Z \setminus v$.

\begin{align*}
b_\pi(Z) &= \sum_{u \in Z \setminus v} \partial b_{\pi,Z}(u) + \partial b_{\pi,Z}(v) = \sum_{u \in Z \setminus v} \partial b_{\pi,Z \setminus v}(u) + b\bigl(v \cup (N^-_\pi(v) \cap Z)\bigr) - b\bigl(N^-_\pi(v) \cap Z\bigr) \\
&= b_\pi(Z \setminus v) + b\bigl(v \cup (Z \setminus v)\bigr) - b(Z \setminus v) = b(Z \setminus v) + b(Z) - b(Z \setminus v) = b(Z).
\end{align*}

(4) Follows from the submodularity of $b$: $\partial b_{\pi,Z_1}(v)$ is the marginal value of $v$ with respect to $N^-_\pi(v) \cap Z_1$, while $\partial b_{\pi,Z_2}(v)$ is the marginal value of $v$ with respect to the subset $N^-_\pi(v) \cap Z_2$ of $N^-_\pi(v) \cap Z_1$.

(5) Immediate from (4).

(6) Let $\pi_X$ be an ordering such that $b_{\pi_X}(X) = b^*(X)$ and define $\pi_Y$ similarly. Let us define ordering $\pi$ such that it starts with the elements of $X$, in the order of $\pi_X$, followed by the elements of $Y \setminus X$, in the order of $\pi_Y$, and completed by an arbitrary ordering of $V(H) \setminus (X \cup Y)$. It is clear that for every $v \in X$, we have $\partial b_{\pi,X \cup Y}(v) = \partial b_{\pi_X,X}(v)$. Furthermore, for every $v \in Y \setminus X$, $N^-_{\pi_Y}(v) \cap Y \subseteq N^-_\pi(v) \cap (X \cup Y)$: if $u$ is a neighbor of $v$ in $Y$ that precedes it in $\pi_Y$, then $u$ is either in $X$ or in $Y \setminus X$; in both cases $u$ precedes $v$ in $\pi$. Thus, similarly to (4), we have $\partial b_{\pi,X \cup Y}(v) \le \partial b_{\pi_Y,Y}(v)$ for every $v \in Y \setminus X$: $\partial b_{\pi,X \cup Y}(v)$ is the marginal value of $v$ with respect to $N^-_\pi(v) \cap (X \cup Y)$, while $\partial b_{\pi_Y,Y}(v)$ is the marginal value of $v$ with respect to $N^-_{\pi_Y}(v) \cap Y$. Now we have

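\[ b^*(X \cup Y) \le b_\pi(X \cup Y) = \sum_{v \in X \cup Y} \partial b_{\pi,X \cup Y}(v) \le \sum_{v \in X} \partial b_{\pi_X,X}(v) + \sum_{v \in Y \setminus X} \partial b_{\pi_Y,Y}(v) \le b^*(X) + b^*(Y). \]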

□ 

Prop. 5.2(3) implies that $\partial b_{\pi,Z}$ can be used to define a fractional independent set:

Lemma 5.3. Let $H$ be a hypergraph and let $b$ be a monotone submodular function defined on $V(H)$. Let $W \subseteq V(H)$ and let $\pi$ be an ordering of $W$. Let us define $\mu(v) = \partial b_{\pi,W}(v)$ for $v \in W$ and $\mu(v) = 0$ otherwise. Then $\mu$ is a fractional independent set of $H$ with $\mu(W) = b_\pi(W) \ge b^*(W)$.

Proof. Let $e$ be an edge of $H$ and let $Z := e \cap W$. We have

\[ \mu(e) = \mu(Z) = \partial b_{\pi,W}(Z) \le \partial b_{\pi,Z}(Z) = b_\pi(Z) = b(Z) \le 1, \]

where the first inequality follows from Prop. 5.2(4), the last equality follows from Prop. 5.2(3), and the second inequality follows from the fact that $b$ is edge dominated. Furthermore, we have $\mu(W) = \partial b_{\pi,W}(W) = b_\pi(W) \ge b(W)$ from Prop. 5.2(1). □
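The construction of Lemma 5.3 can be illustrated on a small example. The sketch below is hypothetical throughout: the helper names, the toy hypergraph with edges $\{a,b,c\}$ and $\{c,d\}$, and the function $b(S) = \min(|S|/2, 1)$ are assumptions chosen only because this $b$ is monotone, submodular, and edge dominated on that hypergraph.

```python
from fractions import Fraction
from itertools import permutations


def primal_neighbors(edges):
    """Map each vertex to the set of vertices sharing a hyperedge with it."""
    adj = {}
    for e in edges:
        for v in e:
            adj.setdefault(v, set()).update(u for u in e if u != v)
    return adj


def mu_from_ordering(b, edges, order, W):
    """mu(v) = marginal value of v w.r.t. its earlier neighbours in W,
    for v in W; mu(v) = 0 for every other vertex."""
    adj = primal_neighbors(edges)
    pos = {v: i for i, v in enumerate(order)}
    mu = {v: 0 for v in order}
    for v in W:
        earlier = frozenset(u for u in adj.get(v, ())
                            if u in W and pos[u] < pos[v])
        mu[v] = b(earlier | {v}) - b(earlier)
    return mu


def is_fractional_independent(mu, edges):
    """Every hyperedge must carry total weight at most 1."""
    return all(sum(mu[v] for v in e) <= 1 for e in edges)


def half_capped(S):
    """Assumed toy function: b(S) = min(|S|/2, 1)."""
    return min(Fraction(len(S), 2), 1)


toy_edges = [{"a", "b", "c"}, {"c", "d"}]
toy_W = {"a", "b", "c", "d"}
```

For every ordering `order` of the four vertices, `mu_from_ordering(half_capped, toy_edges, order, toy_W)` should pass `is_fractional_independent`, and the total weight should be at least `half_capped(toy_W)`, in line with the lemma.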

We close this section by proving the main property of $b^*$ that allows us to avoid the second difficulty described at the beginning of Section 5. First, although it is not used directly, let us state that $b^*$ is additive on sets that are independent from each other:

Lemma 5.4. Let $H$ be a hypergraph, let $b$ be an edge-dominated monotone submodular function defined on $V(H)$ with $b(\emptyset) = 0$, and let $A, B \subseteq V(H)$ be disjoint sets such that there is no edge intersecting both $A$ and $B$. Then $b^*(A \cup B) = b^*(A) + b^*(B)$.

Proof. By Prop. 5.2(6), we have to show only $b^*(A \cup B) \ge b^*(A) + b^*(B)$. Let $\pi$ be an ordering of $V(H)$ such that $b_\pi(A \cup B) = b^*(A \cup B)$; we can assume that $\pi$ starts with the vertices of $A \cup B$. Since there is no edge that intersects both $A$ and $B$, and no vertex outside $A \cup B$ precedes a vertex $u \in A \cup B$, we have $N^-_\pi(u) \subseteq A$ for every $u \in A$ and $N^-_\pi(u) \subseteq B$ for every $u \in B$. Thus $\partial b_{\pi,A \cup B}(u) = \partial b_{\pi,A}(u)$ for every $u \in A$ and $\partial b_{\pi,A \cup B}(u) = \partial b_{\pi,B}(u)$ for every $u \in B$. Therefore, $b^*(A \cup B) = b_\pi(A \cup B) = b_\pi(A) + b_\pi(B) \ge b^*(A) + b^*(B)$, what we had to show. □

The actual statement that we use is more complicated than Lemma 5.4: there can be edges between $A$ and $B$, but we assume that there is a small $(A,B)$-separator. We want to generalize the following trivial statement to our setting:

Proposition 5.5. Let $G$ be a graph, $W \subseteq V(G)$ a set of vertices, $A, B \subseteq W$ two disjoint subsets, and $S$ an $(A,B)$-separator. If $|S| < |A|, |B|$, then $|(C \cap W) \cup S| < |W|$ for every component $C$ of $G \setminus S$.

The proof of Prop. 5.5 is easy to see: every component $C$ of $G \setminus S$ is disjoint from either $A$ or $B$, thus $|C \cap W|$ is at most $|W| - \min\{|A|, |B|\} < |W| - |S|$, implying that $|(C \cap W) \cup S|$ is less than $|W|$. In our setting, we want to measure the size of the sets using the function $b^*$, not by the number of vertices. More precisely, we measure the size of $S$ and $(C \cap W) \cup S$ using $b^*$, while the size of $W$, $A$, and $B$ are measured using the fractional independent set $\mu$ defined by Lemma 5.3. The reason for this will be apparent in the proof of Lemma 5.10: we want to claim that if such a separator $S$ does not exist for any $A, B \subseteq W$, then $W$ is a $(\mu,\lambda)$-connected set for this fractional independent set $\mu$.

Lemma 5.6. Let $H$ be a hypergraph, let $b$ be a monotone submodular function defined on $V(H)$ with $b(\emptyset) = 0$, and let $W$ be a set of vertices. Let $\pi_W$ be an ordering of $V(H)$, and let $\mu(v) := \partial b_{\pi_W,W}(v)$ for $v \in W$ and $\mu(v) = 0$ otherwise. Let $A, B \subseteq W$ be two disjoint sets, and let $S$ be an $(A,B)$-separator. If $b^*(S) < \mu(A), \mu(B)$, then $b^*((C \cap W) \cup S) < \mu(W)$ for every component $C$ of $H \setminus S$.

Proof. Let $C$ be a component of $H \setminus S$ and let $Z := (C \cap W) \cup S$. Let $\pi_S$ be the ordering reaching the minimum in the definition of $b^*(S)$. Let us define the ordering $\pi$ that starts with $S$ in the order of $\pi_S$, followed by $C \cap W$ in the order of $\pi_W$, and finished by an arbitrary ordering of the remaining vertices. It is clear that for every $v \in S$, we have $\partial b_{\pi,Z}(v) = \partial b_{\pi_S,S}(v)$. Let us consider a vertex $v \in C \cap W$ and let $u \in W$ be a neighbor of $v$ that precedes it in $\pi_W$. Since $v \in C$ and $C$ is a component of $H \setminus S$, either $u \in S$ or $u \in C \cap W$. In both cases, $u$ precedes $v$ in $\pi$. This means that $N^-_{\pi_W}(v) \cap W \subseteq N^-_\pi(v) \cap Z$, which implies that $\partial b_{\pi,Z}(v) \le \partial b_{\pi_W,W}(v) = \mu(v)$ for every $v \in C \cap W$. As $S$ separates $A$ and $B$, component $C$ intersects at most one of $A$ and $B$; suppose, without loss of generality, that $C$ is disjoint from $A$. Thus

\[ b^*(Z) \le b_\pi(Z) = \sum_{v \in S} \partial b_{\pi,Z}(v) + \sum_{v \in C \cap W} \partial b_{\pi,Z}(v) \le b^*(S) + \mu(C \cap W) < \mu(A) + \mu(W \setminus A) = \mu(W). \]

□

5.2 Submodular separation 

This section is devoted to understanding what fractional separation means: we show that having a small fractional $(A,B)$-separator is essentially equivalent to the property that for every edge-dominated submodular function $b$, there is an $(A,B)$-separator $S$ such that $b(S)$ is small. The proof is based on a standard trick that is often used for rounding fractional solutions for separation problems: we define a distance function and show by an averaging argument that cutting at some distance $t$ gives a small separator. However, in our setting, we need significant new ideas to make this trick work: the main difficulty is that the cost function $b$ is defined on subsets of vertices and is not a modular function defined by the cost of vertices. To overcome this problem, we use the definitions in Section 5.1 (in particular, the function $\partial b_\pi(v)$) to assign a cost to every single vertex.

Theorem 5.7. Let $H$ be a hypergraph, $X, Y \subseteq V(H)$ two sets of vertices, and $b : 2^{V(H)} \to \mathbb{R}^+$ an edge-dominated monotone submodular function with $b(\emptyset) = 0$. Suppose that $s$ is a fractional $(X,Y)$-separator of weight at most $w$. Then there is an $(X,Y)$-separator $S \subseteq V(H)$ with $b^*(S) = O(w)$.

Proof. Let us define $x(v) := \min\{1, \sum_{e \in E(H),\, v \in e} s(e)\}$. It is clear that if $P$ is a path from $X$ to $Y$, then $\sum_{v \in P} x(v) \ge 1$. We define the distance $d(v)$ to be the minimum of $\sum_{v' \in P} x(v')$, taken over all paths $P$ from $X$ to $v$ (this means that $d(v) > 0$ is possible for some $v \in X$). It is clear that $d(v) \ge 1$ for every $v \in Y$. Let us associate the closed interval $\iota(v) = [d(v) - x(v), d(v)]$ to each vertex $v$. If $v$ is in $X$, then the left endpoint of $\iota(v)$ is 0, while if $v$ is in $Y$, then the right endpoint of $\iota(v)$ is at least 1.

Let $u$ and $v$ be two adjacent vertices in $H$ such that $d(u) \le d(v)$. It is easy to see that $d(v) \le d(u) + x(v)$: there is a path $P$ from $X$ to $u$ such that $\sum_{u' \in P} x(u') = d(u)$, thus the path $P'$ obtained by appending $v$ to $P$ has $\sum_{v' \in P'} x(v') = \sum_{u' \in P} x(u') + x(v) = d(u) + x(v)$. Hence $d(u)$ lies in both $\iota(u)$ and $\iota(v)$, and we have:

Claim 1: If $u$ and $v$ are adjacent, then $\iota(u) \cap \iota(v) \ne \emptyset$.

The class of a vertex $v \in V(H)$ is the largest integer $\kappa(v)$ such that $x(v) \le 2^{-\kappa(v)}$, and we define $\kappa(v) := \infty$ if $x(v) = 0$. Recall that $x(v) \le 1$, thus $\kappa(v)$ is nonnegative. The offset of a vertex $v$ is the unique value $0 \le \alpha < 2 \cdot 2^{-\kappa(v)}$ such that $d(v) = i(2 \cdot 2^{-\kappa(v)}) + \alpha$ for some integer $i$. Let us define an ordering $\pi = (v_1, \dots, v_n)$ of $V(H)$ such that

• $\kappa(v)$ is nondecreasing,

• among vertices having the same class, the offset is nondecreasing.

Let directed graph $D$ be the orientation of the primal graph of $H$ such that if $v_i$ and $v_j$ are adjacent and $i < j$, then there is a directed edge $v_i v_j$ in $D$. If $P$ is a directed path in $D$, then the width of $P$ is the length of the interval $\bigcup_{v \in P} \iota(v)$ (note that by Claim 1, this union is indeed an interval). The following claim bounds the maximum possible width of a directed path:

Claim 2: If $P$ is a directed path in $D$ starting at $v$, then the width of $P$ is at most $16x(v)$.

We first observe that it suffices to prove that if every vertex of $P$ has the same class $\kappa(v)$, then the width of $P$ is at most $4 \cdot 2^{-\kappa(v)}$. Since the class is nondecreasing along the path, we can partition the path into subpaths such that every vertex in a subpath has the same class and the classes are distinct on the different subpaths. The width of $P$ is at most the sum of the widths of the subpaths, which is at most $\sum_{i \ge \kappa(v)} 4 \cdot 2^{-i} = 8 \cdot 2^{-\kappa(v)} \le 16x(v)$.

Suppose now that every vertex of $P$ has the same class $\kappa(v)$ as the first vertex $v$ and let $h := 2^{-\kappa(v)}$. As the offset is nondecreasing, path $P$ can be partitioned into two subpaths: $P_1$, containing the vertices with offset at least $h$, and $P_2$, containing the vertices with offset less than $h$ (one of $P_1$ and $P_2$ can be empty). We show that each of $P_1$ and $P_2$ has width at most $2h$, which implies that the width of $P$ is at most $4h$. Observe that if $u \in P_1$ and $\iota(u)$ contains a point $i \cdot 2h$ for some integer $i$, then, considering $x(u) \le h$ and the bounds on the offset of $u$, this is only possible if $\iota(u) = [i \cdot 2h, i \cdot 2h + h]$, i.e., $i \cdot 2h$ is the left endpoint of $\iota(u)$. Thus if $I_1 = \bigcup_{u \in P_1} \iota(u)$ contains $i \cdot 2h$, then it is the left endpoint of $I_1$. Therefore, $I_1$ can contain $i \cdot 2h$ for at most one value of $i$, which immediately implies that the length of $I_1$ is at most $2h$.

We argue similarly for $P_2$. If $u \in P_2$, then $\iota(u)$ can contain the point $i \cdot 2h + h$ only if $\iota(u) = [i \cdot 2h + h, (i+1) \cdot 2h]$. Thus if $I_2 = \bigcup_{u \in P_2} \iota(u)$ contains $i \cdot 2h + h$, then it is the left endpoint of $I_2$. We get that $I_2$ can contain $i \cdot 2h + h$ for at most one value of $i$, which immediately implies that the width of $I_2$ is at most $2h$. This concludes the proof of Claim 2.

Let $c(v) := \partial b_\pi(v)$.

Claim 3: $\sum_{v \in V(H)} x(v)c(v) \le w$.

Let us examine the contribution of an edge $e \in E(H)$ with value $s(e)$ to the sum. For every vertex $v \in e$, edge $e$ increases the value $x(v)$ by at most $s(e)$. Thus the total contribution of edge $e$ is at most

\[ s(e) \cdot \sum_{v \in e} c(v) = s(e) \cdot \sum_{v \in e} \partial b_\pi(v) \le s(e) \cdot \sum_{v \in e} \partial b_{\pi,e}(v) = s(e) b_\pi(e) = s(e) b(e) \le s(e), \]

where the first inequality follows from Prop. 5.2(5), the last equality follows from Prop. 5.2(3), and the last inequality follows from the fact that $b$ is edge dominated. Therefore, $\sum_{v \in V(H)} x(v)c(v) \le \sum_{e \in E(H)} s(e) \le w$, proving Claim 3.

Let $S$ be a set of vertices. We define $\overline{S}$ to be the set of all vertices from which a vertex of $S$ is reachable on a directed path in $D$ (in particular, this means that $S \subseteq \overline{S}$).

Claim 4: For every $S \subseteq V(H)$, $\sum_{v \in \overline{S}} c(v) = b_\pi(\overline{S})$.

Observe that for any $v \in \overline{S}$, every inneighbor of $v$ is also in $\overline{S}$, hence $N^-_\pi(v) \subseteq \overline{S}$. Therefore, $\partial b_{\pi,\overline{S}}(v) = \partial b_\pi(v) = c(v)$ and the claim follows.

Let $S(t)$ be the set of all vertices $v \in V(H)$ for which $t \in \iota(v)$. Observe that for every $0 \le t \le 1$, the set $S(t)$ (and hence $\overline{S(t)}$) separates $X$ from $Y$. We use an averaging argument to show that there is a $0 \le t \le 1$ for which $b_\pi(\overline{S(t)})$ is $O(w)$. As $b^*(\overline{S(t)}) \le b_\pi(\overline{S(t)})$, the set $\overline{S(t)}$ satisfies the requirement of the theorem.

If we are able to show that $\int_0^1 b_\pi(\overline{S(t)})\,dt = O(w)$, then the existence of the required $t$ clearly follows. Let $I_v(t) = 1$ if $v \in \overline{S(t)}$ and let $I_v(t) = 0$ otherwise. If $I_v(t) = 1$, then there is a path $P$ in $D$ from $v$ to a member of $S(t)$. By Claim 2, the width of this path is at most $16x(v)$, thus $t \in [d(v) - 16x(v), d(v) + 15x(v)]$. Therefore, $\int_0^1 I_v(t)\,dt \le 31x(v)$. Now we have

\[ \int_0^1 b_\pi(\overline{S(t)})\,dt = \int_0^1 \sum_{v \in \overline{S(t)}} c(v)\,dt = \int_0^1 \sum_{v \in V(H)} c(v) I_v(t)\,dt = \sum_{v \in V(H)} c(v) \int_0^1 I_v(t)\,dt \le 31 \sum_{v \in V(H)} x(v)c(v) \le 31w \]

(we used Claim 4 in the first equality and Claim 3 in the last inequality). □
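The distance-based cut used in this proof can be sketched in a few lines. The code below is only an illustration under stated assumptions: it computes $x(v)$, $d(v)$, and the sets $S(t)$ on a toy path hypergraph, and checks that $S(t)$ separates $X$ from $Y$. It does not compute the closure in $D$ or the cost $b_\pi$, and all function names and the example data are hypothetical.

```python
import heapq
from fractions import Fraction


def vertex_costs(edges, s):
    """x(v) = min(1, total weight s(e) of the edges e containing v)."""
    total = {}
    for e, weight in zip(edges, s):
        for v in e:
            total[v] = total.get(v, 0) + weight
    return {v: min(1, val) for v, val in total.items()}


def distances(edges, x, X):
    """d(v): minimum total x-cost of a path from X to v (both endpoints
    counted), computed by Dijkstra on the primal graph."""
    adj = {v: set() for v in x}
    for e in edges:
        for v in e:
            adj[v] |= set(e) - {v}
    d = {v: x[v] for v in X}
    heap = [(d[v], v) for v in X]
    heapq.heapify(heap)
    while heap:
        dist, v = heapq.heappop(heap)
        if dist > d[v]:
            continue  # stale heap entry
        for u in adj[v]:
            if dist + x[u] < d.get(u, float("inf")):
                d[u] = dist + x[u]
                heapq.heappush(heap, (d[u], u))
    return d, adj


def cut_at(t, x, d):
    """S(t): vertices whose interval [d(v) - x(v), d(v)] contains t."""
    return {v for v in d if d[v] - x[v] <= t <= d[v]}


def separates(S, adj, X, Y):
    """True if no path of the primal graph avoiding S joins X to Y."""
    stack = [v for v in X if v not in S]
    seen = set(stack)
    while stack:
        v = stack.pop()
        if v in Y:
            return False
        for u in adj[v]:
            if u not in S and u not in seen:
                seen.add(u)
                stack.append(u)
    return True


# Toy data (assumed): path x - a - b - y, fractional separator of weight 1.
toy_edges = [("x", "a"), ("a", "b"), ("b", "y")]
toy_s = [Fraction(1, 2), Fraction(0), Fraction(1, 2)]
toy_x = vertex_costs(toy_edges, toy_s)
toy_d, toy_adj = distances(toy_edges, toy_x, {"x"})
```

With these toy values every vertex has cost 1/2, the distances are 1/2, 1, 3/2, 2 along the path, and for instance `cut_at(Fraction(3, 4), toy_x, toy_d)` is `{"a"}`. Exact fractions are used so that interval endpoints compare reliably.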

Although it is not used in this paper, we can prove the converse of Theorem 5.7 in a very simple way.

Theorem 5.8. Let $H$ be a hypergraph, and let $X, Y \subseteq V(H)$ be two sets of vertices. Suppose that for every edge-dominated monotone submodular function $b$ on $H$ with $b(\emptyset) = 0$, there is an $(X,Y)$-separator $S$ with $b(S) \le w$. Then there is a fractional $(X,Y)$-separator of weight at most $w$.

Proof. If there is no fractional $(X,Y)$-separator of weight at most $w$, then by LP duality, there is an $(X,Y)$-flow $F$ of value greater than $w$. Let $b(S)$ be defined as the total weight of the paths in $F$ intersecting $S$; it is easy to see that $b$ is a monotone submodular function, and since $F$ is a flow, $b(e) \le 1$ for every $e \in E(H)$. Thus by assumption, there is an $(X,Y)$-separator $S$ with $b(S) \le w$. However, every $X$-$Y$ path of $F$ intersects the $(X,Y)$-separator $S$, which implies $b(S) > w$, a contradiction. □

We close this section by pointing out that finding an (A, B) -separator S with b(S) small for a given submodular 
function b is not an instance of submodular function minimization, and hence the well-known algorithms (see lT3"6ll37l 
I521 ) cannot be used for this problem. If a submodular function g(X) describes the weight of the boundary of X, then 
finding a small $(A,B)$-separator is equivalent to minimizing $g(X)$ subject to $A \subseteq X$, $X \cap B = \emptyset$, which can be expressed as an instance of submodular function minimization (and hence solvable in polynomial time). In our case, however, $b(S)$ is the weight of $S$ itself, which means that we have to minimize $b(S)$ subject to $S$ being an $(A,B)$-separator, and this latter constraint cannot be expressed in the framework of submodular function minimization. A possible workaround is to define $\delta(X)$ as the neighborhood of $X$ (the set of vertices outside $X$ adjacent to $X$) and $b'(X) := b(\delta(X))$; now minimizing $b'(X)$ subject to $A \subseteq X \cup \delta(X)$, $X \cap B = \emptyset$ is the same as finding an $(A,B)$-separator $S$ minimizing $b(S)$. However, the function $b'$ is not necessarily a submodular function in general. Therefore, transforming $b$ to $b'$ this way does not lead to a polynomial-time algorithm using submodular function minimization. In fact, it is quite easy to show that finding an $(A,B)$-separator $S$ with $b(S)$ minimum possible can be an NP-hard problem even if $b$ is a submodular function of very simple form.

Theorem 5.9. Given a graph $G$, subsets of vertices $X$, $Y$, and a collection $\mathcal{S}$ of subsets of vertices, it is NP-hard to find an $(X,Y)$-separator that intersects the minimum number of members of $\mathcal{S}$.

Proof. The proof is by reduction from 3-COLORING. Let H be a graph with n vertices and m edges; we identify the 
vertices of H with the integers from 1 to n. We construct a graph G consisting of 3n + 2 vertices, vertex sets X, Y, and 
a collection S of 6m sets such that there is an (X, Y) -separator 5 intersecting at most 3m members of S if and only if G 
is 3-colorable. 

The graph G consists of two vertices x, y, and for every 1 < i < n, a path xv^iVi^y^y of length 4 connecting x and 
y. The collection S is constructed such that for every edge ij € E(H) and 1 < a,b < 3, a ^ b, there is a corresponding 
set {vi : a,vjfi,x,y}- Let X := {x} and Y := {y}. Observe that the set {v,-. a ,v ; -./,} intersects exactly 3 sets of S if a ^ b 
and exactly 4 sets of S if a = b. 

Let c : V(G) — > {1,2,3} be a 3-coloring of G. The set S = {v^ c a\ | 1 < i < n} is clearly an (X,Y) -separator. For 
every ij 6 E(G), separator S intersects only 3 of the 6 sets {vi, a ,Vi,b,x,y}. Therefore, S intersects exactly 3m members 
of S. 

Consider now an (X,Y) -separator S intersecting at most 3m members of S. Since every member of S contains both 
x and y, it follows that x, y S. Thus S has to contain at least one internal vertex of every path xv^iVi^Vi^y. For every 
1 < i < n, let us fix a vertex v i c ^ G S. We claim that c is a 3-coloring of G. For every ij 6 E(G), S intersects at least 
3 of the sets {v^v,-^ ,#,)>}, and intersects 4 of them if c(i) = c(j). Thus the assumption that 5 intersects at most 3m 
members of S immediately implies that c is a proper 3-coloring. □ 
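The counting at the heart of this reduction can be checked mechanically on a small instance. The sketch below is only an illustration: the function names and the encoding of the vertex $v_{i,a}$ as the tuple `("v", i, a)` are assumptions made for the example. It builds the collection for a triangle and counts how many members a separator derived from a coloring intersects.

```python
from itertools import product

def build_collection(edges):
    """One set {v_(i,a), v_(j,b), x, y} per edge ij of H and per pair a != b."""
    collection = []
    for i, j in edges:
        for a, b in product((1, 2, 3), repeat=2):
            if a != b:
                collection.append(frozenset({("v", i, a), ("v", j, b), "x", "y"}))
    return collection

def separator_from_coloring(coloring):
    """Pick the internal vertex v_(i, c(i)) on the path of every vertex i of H."""
    return {("v", i, colour) for i, colour in coloring.items()}

def members_hit(separator, collection):
    """Number of members of the collection that the separator intersects."""
    return sum(1 for member in collection if member & separator)

triangle = [(1, 2), (2, 3), (1, 3)]           # H is a triangle, so m = 3
collection = build_collection(triangle)       # 6m = 18 sets
proper = separator_from_coloring({1: 1, 2: 2, 3: 3})
improper = separator_from_coloring({1: 1, 2: 1, 3: 2})
print(members_hit(proper, collection), members_hit(improper, collection))
```

With the proper coloring the separator meets $3m = 9$ members; the improper one, which gives both endpoints of the edge $12$ the same color, meets 10.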

5.3 Obtaining a highly connected set 

The following lemma is the same as the main result of Section 5 (Theorem 5.1) with the exception that $b$-width is
replaced by $b^*$-width. By Prop. 5.2(2), $b^*(S) \ge b(S)$ for every set $S \subseteq V(H)$, thus $b^*$-width is not less than $b$-width.
Therefore, the following lemma immediately implies Theorem 5.1.

Lemma 5.10. Let $b$ be an edge-dominated monotone submodular function of $H$ with $b(\emptyset) = 0$. If the $b^*$-width of $H$ is
greater than $\frac{3}{2}(w + 1)$, then $\mathrm{con}_\lambda(H) \ge w$ (for some universal constant $\lambda$).

Proof. Let $\lambda := 1/c$, where $c$ is the universal constant of Lemma 5.7 hidden by the big-O notation. Suppose that
$\mathrm{con}_\lambda(H) < w$, that is, there is no fractional independent set $\mu$ and $(\mu, \lambda)$-connected set $W$ with $\mu(W) \ge w$. We show
that $H$ has a tree decomposition of $b^*$-width at most $\frac{3}{2}(w+1)$, or more precisely, we show the following stronger
statement:

For every subhypergraph $H'$ of $H$ and every $W \subseteq V(H')$ with $b^*(W) \le w + 1$, there is a tree decomposition
of $H'$ having $b^*$-width at most $\frac{3}{2}(w + 1)$ such that $W$ is contained in one of the bags.

We prove this statement by induction on $|V(H')|$. If $b^*(V(H')) \le \frac{3}{2}(w + 1)$, then a decomposition consisting of a single
bag proves the statement. Otherwise, let $W'$ be an inclusionwise maximal superset of $W$ such that $w \le b^*(W') \le w + 1$.
Observe that there has to be at least one such set: from the fact that $b^*(v) \le 1$ for every vertex $v$ and from Prop. 5.2(6),
we know that adding a vertex increases $b^*(W')$ by at most 1. Since $b^*(V(H')) > \frac{3}{2}(w + 1)$, by adding vertices to $W$ in
an arbitrary order, we eventually find a set $W'$ with $b^*(W') \ge w$, and the first such set satisfies $b^*(W') \le w+1$ as well.

Let $\pi$ be an ordering of $V(H')$ such that $b_\pi(W') = b^*(W')$. As in Lemma 5.3, let us define the fractional independent
set $\mu$ by $\mu(v) := \partial b_{\pi,W'}(v)$ if $v \in W'$ and $\mu(v) = 0$ otherwise. Clearly, we have $\mu(W') = b^*(W') \ge w$.

By assumption, $W'$ is not $(\mu, \lambda)$-connected, hence there are disjoint sets $A, B \subseteq W'$ and a fractional $(A, B)$-separator
of weight less than $\lambda \cdot \min\{\mu(A), \mu(B)\}$. Thus by Lemma 5.7, there is an $(A, B)$-separator $S \subseteq V(H')$ with $b^*(S) <
\min\{\mu(A),\mu(B)\} \le \mu(W')/2 \le (w + 1)/2$ (the second inequality follows from the fact that $A$ and $B$ are disjoint subsets
of $W'$). Let $C_1, \dots, C_r$ be the connected components of $H' \setminus S$; by an earlier lemma of this section,
$b^*((C_i \cap W') \cup S) < \mu(W') = b^*(W') \le w + 1$ for every $1 \le i \le r$. As $b^*(V(H')) > \frac{3}{2}(w+1)$ and
$b^*(S) \le (w+1)/2$, it is not possible that $S = V(H')$, hence
$r > 0$. It is not possible that $r = 1$ either: $(C_1 \cap W') \cup S$ would be a superset of $W'$ with $b^*$-value less than $w+1$,
contradicting the maximality of $W'$. Thus $r \ge 2$, which means that each hypergraph $H'_i := H'[C_i \cup S]$ has strictly fewer
vertices than $H'$.

By the induction hypothesis, each $H'_i$ has a tree decomposition $\mathcal{T}_i$ having $b^*$-width at most $\frac{3}{2}(w + 1)$ such that
$W_i := (C_i \cap W') \cup S$ is contained in one of the bags. Let $B_i$ be the bag of $\mathcal{T}_i$ containing $W_i$. We build a tree decomposition
$\mathcal{T}$ of $H'$ by joining together the tree decompositions $\mathcal{T}_1, \dots, \mathcal{T}_r$: let $B_0 := W' \cup S$ be a new bag that is adjacent to bags
$B_1, \dots, B_r$. It can be easily verified that $\mathcal{T}$ is indeed a tree decomposition of $H'$. Furthermore, by Prop. 5.2(6),
$b^*(B_0) \le b^*(W') + b^*(S) \le w + 1 + (w + 1)/2 = \frac{3}{2}(w + 1)$ and by the assumptions on $\mathcal{T}_1, \dots, \mathcal{T}_r$, every other bag has
$b^*$-value at most $\frac{3}{2}(w + 1)$. □

6 From highly connected sets to embeddings 

The main result of this section is showing that the existence of highly connected sets implies that the hypergraph has
large embedding power. Recall that $W$ is a $(\mu, \lambda)$-connected set for some $\lambda > 0$ and fractional independent set $\mu$
if for every disjoint $X,Y \subseteq W$, the minimum weight of a fractional $(X,Y)$-separator is at least $\lambda \cdot \min\{\mu(X),\mu(Y)\}$ (or
equivalently, there is an $(X,Y)$-flow of value $\lambda \cdot \min\{\mu(X),\mu(Y)\}$). Recall also that $\mathrm{con}_\lambda(H)$ denotes the maximum value
of $\mu(W)$ taken over every fractional independent set $\mu$ and $(\mu, \lambda)$-connected set $W$.

Theorem 6.1. For every sufficiently small $\lambda > 0$ and hypergraph $H$, there is a constant $m_{H,\lambda}$ such that every graph
$G$ with $m \ge m_{H,\lambda}$ edges has an embedding into $H$ with edge depth $O(m/(\lambda^{3/2}\,\mathrm{con}_\lambda(H)^{1/4}))$. Furthermore, there is an
algorithm that, given $G$ and $H$, produces such an embedding in time $f(H,\lambda)n^{O(1)}$.

In other words, Theorem 6.1 gives a lower bound on the embedding power of $H$:

Corollary 6.2. For every sufficiently small $\lambda > 0$ and hypergraph $H$, $\mathrm{emb}(H) = \Omega(\lambda^{3/2}\,\mathrm{con}_\lambda(H)^{1/4})$.

Theorem 6.1 is stated in algorithmic form, since the reduction in the hardness result of Section 7 needs to find such
embeddings. For the proof, our strategy is similar to the embedding result of [42]: we show that a highly connected
set implies that a uniform concurrent flow exists, the paths appearing in the uniform concurrent flow can be used to
embed (a blowup of) the line graph of a complete graph, and every graph has an appropriate embedding in the line
graph of a complete graph. To make this strategy work, we need generalizations of concurrent flows, multicuts, and
multicommodity flows in our hypergraph setting and we need to obtain results that connect these concepts to highly
connected sets. Some of these results are similar in spirit to the $O(\sqrt{n})$-approximation algorithms appearing in the
combinatorial optimization literature [30, 31, 3]. However, those approximation algorithms are mostly based on clever
rounding of fractional solutions, while in our setting rounding is not an option: as discussed in Section 5, the existence
of a fractional $(X,Y)$-separator of small weight does not imply the existence of a small integer separator. Thus we have
to work directly with the fractional solution and use the properties of the highly connected set.

It turns out that the right notion of uniform concurrent flow for our purposes is a collection of flows that connect
cliques: that is, a collection $F_{i,j}$ ($1 \le i < j \le k$) of compatible flows, each of value $\varepsilon$, such that $F_{i,j}$ is a $(K_i,K_j)$-flow,
where $K_1, \dots, K_k$ are disjoint cliques. Thus our first goal is to find a highly connected set that can be partitioned into $k$
cliques in an appropriate way.

6.1 Highly connected sets with cliques 

Let $(X_1,Y_1), \dots, (X_k,Y_k)$ be pairs of vertex sets such that the minimum weight of a fractional $(X_i,Y_i)$-separator is $s_i$.
Analogously to multicut problems in combinatorial optimization, we investigate weight assignments that simultaneously
separate all these pairs. Clearly, the minimum weight of such an assignment is at least the minimum of the
$s_i$'s and at most the sum of the $s_i$'s. The following lemma shows that in a highly connected set, such a simultaneous
separator cannot be very efficient: roughly speaking, its weight is at least the square root of the sum of the $s_i$'s.

Lemma 6.3. Let $\mu$ be a fractional independent set in hypergraph $H$ and let $W$ be a $(\mu,\lambda)$-connected set for some
$0 < \lambda \le 1$. Let $(X_1,\dots,X_k,Y_1,\dots,Y_k)$ be a partition of $W$, let $w_i := \min\{\mu(X_i),\mu(Y_i)\} \ge 1/2$, and let $w := \sum_{i=1}^{k} w_i$.
Let $s : E(H) \to \mathbb{R}^+$ be a weight assignment of total weight $p$ such that $s$ is a fractional $(X_i,Y_i)$-separator for every
$1 \le i \le k$. Then $p \ge (\lambda/7) \cdot \sqrt{w}$.

Proof. Let us define the function $s'$ by $s'(e) = 6s(e)$ and let $x(v) := \sum_{e \in E(H),\, v \in e} s'(e)$. We define the distance $d(u,v)$
to be the minimum of $\sum_{v' \in P} x(v')$, taken over all paths $P$ from $u$ to $v$. It is clear that the triangle inequality holds, i.e.,
$d(u,v) \le d(u,z)+d(z,v)$ for every $u,v,z \in V(H)$. If $s$ covers every $u$-$v$ path, then $d(u,v) \ge 6$: every edge $e$ intersecting
a $u$-$v$ path $P$ contributes at least $s'(e)$ to the sum $\sum_{v' \in P} x(v')$ (as $e$ can intersect $P$ in more than one vertex, $e$ can
increase the sum by more than $s'(e)$). On the other hand, if $d(u, v) \ge 2$, then $s'$ covers every $u$-$v$ path. Clearly, it is
sufficient to verify this for minimal paths. Such a path $P$ can intersect an edge $e$ at most twice, hence $e$ contributes at
most $2s'(e)$ to the sum $\sum_{v' \in P} x(v') \ge 2$, implying that the edges intersecting $P$ have total weight at least 1 in $s'$.
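The distance $d$ is an ordinary vertex-weighted shortest-path distance, so it can be computed with Dijkstra's algorithm. The following is a minimal sketch under assumptions of this example only: the hypergraph is a dictionary mapping edge names to vertex sets, two vertices are consecutive on a path when they share an edge, and the names `vertex_load` and `distance` are hypothetical.

```python
import heapq
import math

def vertex_load(edges, s, factor=6.0):
    """x(v): sum of s'(e) = factor * s(e) over the hyperedges e containing v."""
    x = {}
    for name, vertices in edges.items():
        for v in vertices:
            x[v] = x.get(v, 0.0) + factor * s.get(name, 0.0)
    return x

def distance(edges, x, source, target):
    """d(source, target): minimum over paths of the sum of x on the path's vertices."""
    neighbours = {}
    for vertices in edges.values():
        for v in vertices:
            neighbours.setdefault(v, set()).update(vertices - {v})
    best = {source: x[source]}
    heap = [(x[source], source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return d
        if d > best.get(v, math.inf):
            continue  # stale heap entry
        for u in neighbours[v]:
            candidate = d + x[u]
            if candidate < best.get(u, math.inf):
                best[u] = candidate
                heapq.heappush(heap, (candidate, u))
    return math.inf

# A path-like hypergraph a - b - c - d; putting weight 1 on the middle edge covers every a-d path.
edges = {"e1": {"a", "b"}, "e2": {"b", "c"}, "e3": {"c", "d"}}
x = vertex_load(edges, {"e2": 1.0})
print(distance(edges, x, "a", "d"))
```

In this toy instance the middle edge contributes $s'(e) = 6$ to both of its vertices, so the distance of `a` and `d` is 12, in line with the bound $d(u,v) \ge 6$ for covered pairs.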

Suppose for contradiction that $p < (\lambda/7) \cdot \sqrt{w}$, that is, $w > 49p^2/\lambda^2$. Let $A := \emptyset$ and $B := \bigcup_{i=1}^{k}(X_i \cup Y_i)$. Note
that $\mu(B) \ge 2\sum_{i=1}^{k} w_i = 2w$. We will increase $A$ and decrease $B$ while maintaining the invariant condition that the
distance of $A$ and $B$ is at least 2 in $d$. Let $T$ be the smallest integer such that $\sum_{i=1}^{T} w_i > 6p/\lambda$; if there is no such $T$, then
$w \le 6p/\lambda$, a contradiction. As $w_i \ge 1/2$ for every $i$, it follows that $T \le 12p/\lambda + 1 \le 13p/\lambda$ (since $p \ge 1$ and $\lambda \le 1$).

For $i = 1,2, \dots, T$, we perform the following step. Let $X'_i$ (resp., $Y'_i$) be the set of all vertices of $W$ that are at
distance at most 2 from $X_i$ (resp., $Y_i$). As the distance of $X_i$ and $Y_i$ is at least 6, by the triangle inequality the distance of
$X'_i$ and $Y'_i$ is at least 2, hence $s'$ is a fractional $(X'_i, Y'_i)$-separator. Since $W$ is $(\mu, \lambda)$-connected and $s'$ is an assignment of
weight $6p$, we have $\min\{\mu(X'_i),\mu(Y'_i)\} \le 6p/\lambda$. If $\mu(X'_i) \le 6p/\lambda$, then let us put $X_i$ into $A$ and let us remove $X'_i$ from
$B$. The set $X'_i$, which we remove from $B$, contains all the vertices that are at distance at most 2 from any new vertex in
$A$, hence it remains true that the distance of $A$ and $B$ is at least 2. Similarly, if $\mu(X'_i) > 6p/\lambda$ and $\mu(Y'_i) \le 6p/\lambda$, then
let us put $Y_i$ into $A$ and let us remove $Y'_i$ from $B$. Note that we may put a vertex into $A$ even if it was removed from $B$ in
an earlier step.

In the $i$-th step of the procedure, we increase $\mu(A)$ by at least $w_i$ (as $\mu(X_i),\mu(Y_i) \ge w_i$ and these sets are disjoint
from the sets already contained in $A$) and $\mu(B)$ is decreased by at most $6p/\lambda$. Thus at the end of the procedure, we
have $\mu(A) \ge \sum_{i=1}^{T} w_i > 6p/\lambda$ and

\[ \mu(B) \ge 2w - T \cdot 6p/\lambda > 98p^2/\lambda^2 - (13p/\lambda)(6p/\lambda) \ge 6p/\lambda, \]

that is, $\min\{\mu(A),\mu(B)\} > 6p/\lambda$. By construction, the distance of $A$ and $B$ is at least 2, thus $s'$ is a fractional $(A,B)$-separator
of weight exactly $6p$, contradicting the assumption that $W$ is $(\mu, \lambda)$-connected. □

In the rest of the section, we need a more constrained notion of flow, where the endpoints "respect" a particular
fractional independent set. Let $\mu_1$, $\mu_2$ be fractional independent sets of hypergraph $H$ and let $X,Y \subseteq V(H)$ be two (not
necessarily disjoint) sets of vertices. A $(\mu_1, \mu_2)$-demand $(X,Y)$-flow is an $(X,Y)$-flow $F$ such that for each $x \in X$, the
total weight of the paths in $F$ having first endpoint $x$ is at most $\mu_1(x)$, and similarly, for each $y \in Y$, the total weight of the paths in
$F$ having second endpoint $y$ is at most $\mu_2(y)$. Note that there is no bound on the weight of the paths going through
an $x \in X$; we only bound the paths whose first/second endpoint is $x$. The definition is particularly delicate if $X$ and
$Y$ are not disjoint: in this case, a vertex $z \in X \cap Y$ can be the first endpoint of some paths and the second endpoint of
some other paths, or it can be even both the first and second endpoint of a path of length 0. We use the abbreviation
$\mu$-demand for $(\mu,\mu)$-demand.
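The demand condition only looks at the endpoints of the paths, which makes it easy to check for an explicitly given flow. The sketch below assumes, for illustration only, that a flow is a list of `(path, weight)` pairs with each path a tuple of vertices; it does not verify that the paths form a valid flow (edge capacities) or that the endpoints lie in $X$ and $Y$.

```python
def endpoint_loads(flow):
    """Total path weight per first endpoint and per second endpoint."""
    first, second = {}, {}
    for path, weight in flow:
        first[path[0]] = first.get(path[0], 0.0) + weight
        second[path[-1]] = second.get(path[-1], 0.0) + weight
    return first, second

def respects_demands(flow, mu1, mu2, tol=1e-9):
    """True if first endpoints stay within mu1 and second endpoints within mu2."""
    first, second = endpoint_loads(flow)
    return (all(load <= mu1.get(v, 0.0) + tol for v, load in first.items())
            and all(load <= mu2.get(v, 0.0) + tol for v, load in second.items()))

# The last path has length 0: vertex c is both its first and its second endpoint.
flow = [(("a", "b", "c"), 0.5), (("a", "c"), 0.25), (("c",), 0.25)]
print(respects_demands(flow, {"a": 0.75, "c": 0.25}, {"c": 1.0}))
print(respects_demands(flow, {"a": 0.5, "c": 0.25}, {"c": 1.0}))
```

The first call succeeds, while the second fails because the paths starting at `a` have total weight 0.75, which exceeds the assumed bound $\mu_1(a) = 0.5$.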

The following lemma shows that if a flow connects a set $U$ with a highly connected set $W$, then $U$ is highly
connected as well ("$W$ can be moved to $U$"). This observation will be used in the proof of Lemma 6.5, where we
locate cliques and show that their union is highly connected, since there is a flow that connects the cliques to a highly
connected set.

Lemma 6.4. Let $H$ be a hypergraph, $\mu_1$, $\mu_2$ fractional independent sets, and $W \subseteq V(H)$ a $(\mu_1,\lambda)$-connected set for
some $0 < \lambda \le 1$. Suppose that $U \subseteq V(H)$ is a set of vertices and $F$ is a $(\mu_1,\mu_2)$-demand $(W,U)$-flow of value $\mu_2(U)$.
Then $U$ is $(\mu_2, \lambda/6)$-connected.

Proof. Suppose that there are disjoint sets $A, B \subseteq U$ and a fractional $(A,B)$-separator $s$ of weight $w < (\lambda/6) \cdot \min\{\mu_2(A),\mu_2(B)\}$.
(Note that this means $\mu_2(A), \mu_2(B) > 6w/\lambda \ge 6w$.) For a path $P$, let $s(P) = \sum_{e \in E(H),\, e \cap P \neq \emptyset} s(e)$ be the total weight of
the edges intersecting $P$. Let $A' \subseteq W$ (resp., $B' \subseteq W$) contain a vertex $v \in W$ if there is a path $P$ in $F$ with first endpoint $v$
and second endpoint in $A$ (resp., $B$) and $s(P) \le 1/3$. If $A' \cap B' \neq \emptyset$, then it is clear that there is a path $P$ with $s(P) \le 2/3$
connecting a vertex of $A$ and a vertex of $B$ via a vertex of $A' \cap B'$, a contradiction. Thus we can assume that $A'$ and $B'$
are disjoint.

Since $F$ is a flow, the total weight of the paths in $F$ with $s(P) > 1/3$ is at most $3w$. As the value of $F$ is exactly
$\mu_2(U)$, the total weight of the paths in $F$ with second endpoint in $A$ is exactly $\mu_2(A)$. If $s(P) \le 1/3$ for such a path,
then its first endpoint is in $A'$ by definition. Therefore, the total weight of the paths in $F$ with first endpoint in $A'$
is at least $\mu_2(A) - 3w$, which means that $\mu_1(A') \ge \mu_2(A) - 3w \ge \mu_2(A)/2$. Similarly, we have $\mu_1(B') \ge \mu_2(B)/2$.
Since $W$ is $(\mu_1, \lambda)$-connected and $s$ is an assignment with weight less than $(\lambda/6) \cdot \min\{\mu_2(A),\mu_2(B)\} \le (\lambda/3) \cdot
\min\{\mu_1(A'), \mu_1(B')\}$, there is an $A'$-$B'$ path $P$ with $s(P) < 1/3$. Now path $P$, together with an $A'$-$A$ path $P_A$ having
$s(P_A) \le 1/3$, and a $B'$-$B$ path $P_B$ having $s(P_B) \le 1/3$ forms an $A$-$B$ path that is not covered by $s$, a contradiction. □

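A $\mu$-demand multicommodity flow between pairs $(A_1,B_1), \dots, (A_r,B_r)$ is a set $F_1, \dots, F_r$ of compatible flows such
that $F_i$ is a $\mu$-demand $(A_i,B_i)$-flow. The value of a multicommodity flow is the sum of the values of the $r$ flows. Let
$A = \bigcup_{i=1}^{r} A_i$, $B = \bigcup_{i=1}^{r} B_i$, and suppose for simplicity that $(A_1, \dots, A_r, B_1, \dots, B_r)$ is a partition of $A \cup B$. In this case,
the maximum value of a $\mu$-demand multicommodity flow between pairs $(A_1,B_1), \dots, (A_r,B_r)$ can be expressed as the
optimum values of the following primal and dual linear programs (we denote by $\mathcal{P}_{uv}$ the set of all $u$-$v$ paths):

Primal LP

\[
\begin{array}{lll}
\text{maximize} & \displaystyle \sum_{i=1}^{r} \sum_{u \in A_i,\, v \in B_i} \sum_{P \in \mathcal{P}_{uv}} x(P) & \\[2ex]
\text{s.t.} & \displaystyle \sum_{i=1}^{r} \sum_{u \in A_i,\, v \in B_i} \sum_{P \in \mathcal{P}_{uv},\, P \cap e \neq \emptyset} x(P) \le 1 & \forall e \in E(H) \\[2ex]
 & \displaystyle \sum_{v \in B_i} \sum_{P \in \mathcal{P}_{uv}} x(P) \le \mu(u) & \forall 1 \le i \le r,\ u \in A_i \\[2ex]
 & \displaystyle \sum_{u \in A_i} \sum_{P \in \mathcal{P}_{uv}} x(P) \le \mu(v) & \forall 1 \le i \le r,\ v \in B_i \\[2ex]
 & x(P) \ge 0 & \forall 1 \le i \le r,\ u \in A_i,\ v \in B_i,\ P \in \mathcal{P}_{uv}
\end{array}
\]

Dual LP

\[
\begin{array}{lll}
\text{minimize} & \displaystyle \sum_{e \in E(H)} y(e) + \sum_{u \in A} \mu(u)y(u) + \sum_{v \in B} \mu(v)y(v) & \\[2ex]
\text{s.t.} & \displaystyle \sum_{e \in E(H),\, e \cap P \neq \emptyset} y(e) + y(u) + y(v) \ge 1 & \forall 1 \le i \le r,\ u \in A_i,\ v \in B_i,\ P \in \mathcal{P}_{uv} \\[2ex]
 & y(e) \ge 0 & \forall e \in E(H) \\
 & y(u) \ge 0 & \forall u \in A \\
 & y(v) \ge 0 & \forall v \in B
\end{array}
\]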
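To make the dual program concrete, the following sketch evaluates its objective and checks its path constraints for an explicitly enumerated list of paths. It is only an illustration under assumptions of this example: the names are hypothetical, the hypergraph is a dictionary from edge names to vertex sets, and a real instance has far too many paths to enumerate.

```python
def path_cover(path, edges, y_edge):
    """Total dual weight of the hyperedges that intersect the path."""
    return sum(y_edge.get(name, 0.0)
               for name, vertices in edges.items() if vertices & set(path))

def is_dual_feasible(paths, edges, y_edge, y_vertex, tol=1e-9):
    """Check the constraint (edge cover) + y(u) + y(v) >= 1 for every listed u-v path."""
    return all(
        path_cover(p, edges, y_edge) + y_vertex.get(p[0], 0.0) + y_vertex.get(p[-1], 0.0)
        >= 1.0 - tol
        for p in paths
    )

def dual_objective(y_edge, y_vertex, mu):
    """Sum of y(e) over edges plus sum of mu(v) * y(v) over the endpoint variables."""
    return sum(y_edge.values()) + sum(mu.get(v, 0.0) * y for v, y in y_vertex.items())

edges = {"e1": {"a", "b"}, "e2": {"b", "c"}}
paths = [("a", "b", "c")]               # the only path between A_1 = {a} and B_1 = {c}
mu = {"a": 0.25, "c": 1.0}
print(is_dual_feasible(paths, edges, {"e1": 0.5, "e2": 0.5}, {}), dual_objective({"e1": 0.5, "e2": 0.5}, {}, mu))
print(is_dual_feasible(paths, edges, {}, {"a": 1.0}), dual_objective({}, {"a": 1.0}, mu))
```

Both assignments are feasible here, but putting the weight on the vertex variable $y(a)$ is cheaper because $\mu(a)$ is small; this is the kind of solution, dominated by vertex variables, that the discussion of the proof of Lemma 6.5 is concerned with.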



The following lemma shows that if $\mathrm{con}_\lambda(H)$ is sufficiently large, then there is a highly connected set that has the
additional property that it is the union of $k$ cliques $K_1, \dots, K_k$ with $\mu(K_i) \ge 1/2$ for every clique. The high-level idea
of the proof is the following. Take a $(\mu, \lambda)$-connected set $W$ with $\mu(W) = \mathrm{con}_\lambda(H)$ and find a large multicommodity
flow between some pairs $(A_1,B_1), \dots, (A_r,B_r)$ in $W$. Consider the dual solution $y$. The edges and vertices with nonzero
value in $y$ cover the multicommodity flow. If most of the weight of the dual solution is on the edge variables, then there
is a small number of edges that intersect a large fraction of the flow. These edges are connected to $W$ by a flow, and
therefore by Lemma 6.4, the union of these edges is also highly connected and obviously can be partitioned into a small
number of cliques.

There are two things that can go wrong with this argument. First, it can happen that the dual solution assigns
most of the weight to the vertex variables $y(u)$, $y(v)$ ($u \in A$, $v \in B$). This case is only possible if the value of the
dual (and hence the primal) solution is close to $\sum_{i=1}^{r} \min\{\mu(A_i), \mu(B_i)\}$. To avoid this situation, we want to select
the pairs $(A_i,B_i)$ such that they are only "moderately connected": there is a fractional $(A_i,B_i)$-separator of weight
$2\lambda \min\{\mu(A_i),\mu(B_i)\}$, making the weight of the primal solution much less than $\sum_{i=1}^{r}(\mu(A_i) + \mu(B_i))$ (if $\lambda$ is small).
If we are not able to find sufficiently many such pairs, then we argue that a large subset $W' \subseteq W$ is $(2\mu, \lambda)$-connected
and $2\mu(W') > \mathrm{con}_\lambda(H)$, a contradiction (here we have to make sure first that $2\mu$ is also a fractional independent set).

The second problem is that the value of the dual solution can be so small that we find a very small set of edges
that already cover a large fraction of the multicommodity flow. However, in this case we arrive at a contradiction by
using Lemma 6.3 to show that $W$ is not $(\mu, \lambda)$-connected.

Lemma 6.5. Let $H$ be a hypergraph and let $0 < \lambda < 1/16$ be a constant. Then there is a fractional independent set $\mu$, a
$(\mu, \lambda/6)$-connected set $W$, and a partition $(K_1, \dots, K_k)$ of $W$ such that $k = \Omega(\lambda\sqrt{\mathrm{con}_\lambda(H)})$, and for every $1 \le i \le k$,
$K_i$ is a clique with $\mu(K_i) \ge 1/2$.

Proof. Let $k$ be the largest integer such that $\mathrm{con}_\lambda(H) \ge 3T + 2k$ holds, where $T := (56/\lambda)^2 \cdot k^2$; it is clear that $k =
\Omega(\lambda\sqrt{\mathrm{con}_\lambda(H)})$. Let $\mu_0$ be a fractional independent set and $W$ be a $(\mu_0, \lambda)$-connected set with $\mu_0(W) = \mathrm{con}_\lambda(H)$. We
can assume that $\mu_0(v) > 0$ if and only if $v \in W$. This also implies that $W$ is in one connected component of $H$.
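The choice of $k$ can be read as a small computation. The sketch below (the function name `largest_k` is hypothetical) finds the largest $k$ with $\mathrm{con}_\lambda(H) \ge 3T + 2k$ for $T = (56/\lambda)^2 k^2$ by simple search, which also shows how large $\mathrm{con}_\lambda(H)$ has to be before even one clique is guaranteed.

```python
def largest_k(con, lam):
    """Largest integer k with con >= 3*T + 2*k, where T = (56/lam)**2 * k**2."""
    k = 0
    while 3 * (56 / lam) ** 2 * (k + 1) ** 2 + 2 * (k + 1) <= con:
        k += 1
    return k

# For lam = 1/32 we have 56/lam = 1792, so k = 1 needs con >= 3 * 1792**2 + 2 = 9633794.
print(largest_k(9633794, 1 / 32), largest_k(38535172, 1 / 32))
```

Since $3T$ dominates $2k$, the resulting $k$ is roughly $\lambda\sqrt{\mathrm{con}_\lambda(H)}/(56\sqrt{3})$, which matches the stated $\Omega(\lambda\sqrt{\mathrm{con}_\lambda(H)})$ bound.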

Highly loaded edges. First, we want to modify $\mu_0$ such that there is no edge $e$ with $\mu_0(e) \ge 1/2$. Let us choose
edges $g_1$, $g_2$, $\dots$ as long as possible with the requirement $\mu_0(G_i) \ge 1/2$ for $G_i := g_i \setminus \bigcup_{j=1}^{i-1} g_j$. If we can select at
least $k$ such edges, then the cliques $K_1, \dots, K_k$ can be found in an easy way. In this case, let $K_i := G_i \cap W$; clearly
$W' := \bigcup_{i=1}^{k} (G_i \cap W)$ is a $(\mu_0, \lambda)$-connected set, $\mu_0(K_i) \ge 1/2$, and $(K_1, \dots, K_k)$ is a partition of $W'$ into cliques.

Thus we can assume that the selection of the edges stops at edge $g_t$ for some $t < k$. Let $W_0 := W \setminus \bigcup_{i=1}^{t} g_i$. Observe
that there is no edge $e \in E(H)$ with $\mu_0(e \cap W_0) \ge 1/2$, as in this case the selection of the edges could be continued
with $g_{t+1} := e$. Thus if we define $\mu$ such that $\mu(v) = 2\mu_0(v)$ if $v \in W_0$ and $\mu(v) = 0$ otherwise, then $\mu$ is a fractional
independent set. Note that $\mu(W_0) = 2\mu_0(W \setminus \bigcup_{i=1}^{t} g_i) \ge 2(\mu_0(W) - k) = 2\mu_0(W) - 2k$.
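The edge selection in this step is a plain greedy loop. The sketch below is an illustration under assumptions of this example: edges are given as a dictionary of vertex sets, `mu0` is a dictionary of vertex weights whose support plays the role of $W$, and the function names are hypothetical. It returns the chosen edges, the parts $G_i$, and the doubled measure on the uncovered vertices.

```python
def select_loaded_edges(edges, mu0, k, threshold=0.5):
    """Greedily pick edges g_1, g_2, ... whose uncovered part G_i has mu0-weight >= threshold."""
    covered, chosen, parts = set(), [], []
    progress = True
    while progress and len(chosen) < k:
        progress = False
        for name, vertices in edges.items():
            fresh = vertices - covered          # G_i = g_i minus all earlier edges
            if sum(mu0.get(v, 0.0) for v in fresh) >= threshold:
                chosen.append(name)
                parts.append(fresh)
                covered |= vertices
                progress = True
                break
    return chosen, parts, covered

def doubled_measure(mu0, covered):
    """mu(v) = 2 * mu0(v) on the uncovered vertices (W_0), and 0 on the covered ones."""
    return {v: (0.0 if v in covered else 2.0 * weight) for v, weight in mu0.items()}

edges = {"g1": {1, 2}, "g2": {2, 3}, "g3": {4}}
mu0 = {1: 0.5, 2: 0.25, 3: 0.125, 4: 0.125}
chosen, parts, covered = select_loaded_edges(edges, mu0, k=3)
print(chosen, doubled_measure(mu0, covered))
```

Here the selection stops after one edge, and doubling the weights on the remaining vertices still leaves every edge with total weight at most 1, as the argument requires.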

Moderately connected pairs. The set $W_0$ is $(\mu_0, \lambda)$-connected, but not necessarily $(\mu, \lambda)$-connected. In the next
step, we find a collection of pairs $(A_i,B_i)$ that violate $(\mu, \lambda)$-connectivity. We repeat the following step for $i = 1,2,\dots$
as long as possible. If there are disjoint subsets $A_i,B_i \subseteq W_{i-1}$ such that there is a fractional $(A_i,B_i)$-separator with value
less than $\lambda w_i$ for $w_i := \min\{\mu(A_i),\mu(B_i)\}$, then define $W_i := W_{i-1} \setminus (A_i \cup B_i)$. Informally, we can say that these pairs
$(A_i,B_i)$ are "moderately connected": the minimum value of a fractional $(A_i, B_i)$-separator is less than $\lambda w_i$, but at least
$\lambda w_i/2 = \lambda\min\{\mu_0(A_i),\mu_0(B_i)\}$ (because $W$ is $(\mu_0, \lambda)$-connected). Note that every fractional separator has value at
least 1 (as $W$ is in a single component of $H$), thus $\lambda w_i > 1$ holds, implying $w_i > 1/\lambda > 1$. In each step, we select
$A_i$ and $B_i$ such that $|A_i| + |B_i|$ is minimum possible. In particular, this implies that $\mu(A_i),\mu(B_i) \le w_i + 1 \le 2w_i$: if,
say, $\mu(A_i) > \mu(B_i) + 1$, then removing an arbitrary vertex of $A_i$ decreases $\mu(A_i)$ by at most one (as $\mu$ is a fractional
independent set) without changing $\min\{\mu(A_i),\mu(B_i)\}$, hence there would be a smaller pair of sets with the required
properties. Therefore, we have $2w_i \le \mu(A_i \cup B_i) \le 2w_i + 1 \le 3w_i$ for every $1 \le i \le r$.

Suppose that the procedure stops after finding the pairs $(A_1,B_1), \dots, (A_r,B_r)$ for some $r \ge 0$. Suppose first that
$w := \sum_{i=1}^{r} w_i < T$. Then $\mu(\bigcup_{i=1}^{r}(A_i \cup B_i)) \le 3w < 3T$, hence $\mu(W_r) > \mu(W_0) - 3T \ge 2\mu_0(W) - 2k - 3T \ge \mu_0(W) =
\mathrm{con}_\lambda(H)$. Since the procedure stopped, there is no fractional $(A', B')$-separator of value less than $\lambda \cdot \min\{\mu(A'),\mu(B')\}$
for any disjoint $A',B' \subseteq W_r$, that is, $W_r$ is $(\mu, \lambda)$-connected with $\mu(W_r) > \mathrm{con}_\lambda(H)$, contradicting the definition of
$\mathrm{con}_\lambda(H)$. Thus in the following, we can assume that $w \ge T$.

Finding a multicommodity flow. By construction, there is a fractional $(A_i, B_i)$-separator of value less than $\lambda w_i$,
hence the maximum value of a $\mu$-demand multicommodity flow between pairs $(A_1,B_1), \dots, (A_r,B_r)$ is less than $\lambda w$.
Let $A := \bigcup_{i=1}^{r} A_i$ and $B := \bigcup_{i=1}^{r} B_i$. Let us consider an optimum dual solution with value $Y = Y_1 + Y_2$, where $Y_1$ is the
contribution of the variables $y(u)$, $y(v)$ ($u \in A$, $v \in B$), and $Y_2$ is the contribution of the variables $y(e)$ ($e \in E(H)$). Let
$A^* := \{u \in A \mid y(u) \le 1/4\}$, $B^* := \{v \in B \mid y(v) \le 1/4\}$, $A^*_i = A_i \cap A^*$, $B^*_i = B_i \cap B^*$, and $w^*_i = \min\{\mu(A^*_i),\mu(B^*_i)\}$.
For each $i$, the value of $w^*_i$ is either at least $w_i/2$, or less than that. Assume without loss of generality that there is a
$1 \le r^* \le r$ such that $w^*_i \ge w_i/2$ if and only if $i \le r^*$. Let $w^* = \sum_{i=1}^{r^*} w^*_i$.

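We claim that $w^* \ge w/4$. Note that $w^*_i < w_i/2$ means that either $\mu(A^*_i) < w_i/2$ or $\mu(B^*_i) < w_i/2$; as $\mu(A_i),\mu(B_i) \ge
w_i$, this is only possible if $\mu(A_i \setminus A^*_i) + \mu(B_i \setminus B^*_i) > w_i/2$. Suppose first that $\sum_{i=r^*+1}^{r} w_i > w/2$. This would imply

\[ \mu((A \setminus A^*) \cup (B \setminus B^*)) \ge \sum_{i=r^*+1}^{r} \big(\mu(A_i \setminus A^*_i) + \mu(B_i \setminus B^*_i)\big) > \sum_{i=r^*+1}^{r} w_i/2 > w/4. \]

However, $y(u) > 1/4$ for every $u \in (A \setminus A^*) \cup (B \setminus B^*)$, thus $Y \ge Y_1 \ge \mu((A \setminus A^*) \cup (B \setminus B^*))/4 > w/16 > \lambda w$ (since
$\lambda < 1/16$), a contradiction with our earlier observation that the optimum is at most $\lambda w$. Thus we can assume that
$\sum_{i=r^*+1}^{r} w_i \le w/2$ and hence $\sum_{i=1}^{r^*} w_i \ge w/2$. Together with $w^*_i \ge w_i/2$ for every $1 \le i \le r^*$, this implies $w^* \ge w/4$.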

As $y(a), y(b) \le 1/4$ for every $a \in A^*$, $b \in B^*$, it is clear that for every $A^*_i$-$B^*_i$ path $P$, the total weight of the edges
intersecting $P$ has to be at least 1/2 in assignment $y$. Therefore, if we define $y^* : E(H) \to \mathbb{R}^+$ by $y^*(e) = 2y(e)$ for
every $e \in E(H)$, then $y^*$ covers every $A^*_i$-$B^*_i$ path. Let $W^* = \bigcup_{i=1}^{r^*}(A^*_i \cup B^*_i)$. We use Lemma 6.3 for the $(\mu,\lambda)$-connected
set $W^*$, the pairs $(A^*_1,B^*_1), \dots, (A^*_{r^*},B^*_{r^*})$, and for the weight assignment $y^*$. Note that $w^*_i \ge w_i/2 \ge 1/2$ for
every $i$. It follows that the total weight of $y^*$ on the edges is at least $(\lambda/7) \cdot \sqrt{w^*} \ge (\lambda/14) \cdot \sqrt{w}$, which means that
$Y_2 = \sum_{e \in E(H)} y(e) \ge (\lambda/28) \cdot \sqrt{w} \ge (\lambda/28) \cdot \sqrt{T} \ge 2k$.

Locating the cliques. Let us fix an optimum primal and dual solution for the maximum multicommodity flow
problem with pairs $(A^*_1,B^*_1), \dots, (A^*_{r^*},B^*_{r^*})$ and let $F^{(0)}$ be the sum of the flows obtained from the primal solution. We
select $k$ cliques $K_1, \dots, K_k$ and associate a subflow $F_i$ of $F^{(0)}$ with each clique $K_i$ as follows. Let $F^{(i)}$ be the flow obtained
from $F^{(0)}$ by removing $F_1, \dots, F_i$. For every $u$-$v$ path $P$ appearing in $F^{(0)}$, we get $\sum_{e \in E(H),\, e \cap P \neq \emptyset} y(e) + y(u) + y(v) = 1$
from complementary slackness: if the primal variable corresponding to $P$ is nonzero, then the corresponding dual
constraint is tight. In particular, this means that the total weight of the edges intersecting such a path $P$ is at most 1.
Let $c(e,F^{(i)})$ be the total weight of the paths in $F^{(i)}$ intersecting edge $e$ and let $C_i = \sum_{e \in E(H)} y(e)c(e,F^{(i)})$. Again by
complementary slackness, $c(e,F^{(0)}) = 1$ for each $e \in E(H)$ with $y(e) > 0$ and hence $C_0 = \sum_{e \in E(H)} y(e) \ge 2k$.

Let us select $e_i$ to be an edge such that $c(e_i,F^{(i-1)})$ is maximum possible and let $K_i := e_i \setminus \bigcup_{j=1}^{i-1} e_j$. Let the flow
$F_i$ contain all the paths of $F^{(i-1)}$ intersecting $e_i$. Observe that the paths appearing in $F_i$ do not intersect $e_1, \dots, e_{i-1}$
(otherwise they would no longer be in $F^{(i-1)}$), thus clique $K_i$ intersects every path in $F_i$. As $F^{(i-1)}$ is a subflow of
$F^{(0)}$, we have that for every path $P$ in $F^{(i-1)}$, the edges intersecting $P$ have total weight at most 1 in $y$. This means
that if we remove a path of weight $\gamma$ from $F^{(i-1)}$, then $C_{i-1}$ decreases by at most $\gamma$. As the total weight of the paths
intersecting $e_i$ is at most 1, we get that $C_i \ge C_{i-1} - 1$ and hence $C_i \ge C_0 - k \ge C_0/2$ for $i \le k$. Since $C_0 = \sum_{e \in E(H)} y(e)$
and $C_i = \sum_{e \in E(H)} y(e)c(e,F^{(i)})$, it is easy to see that $C_i \ge C_0/2$ implies that there has to be at least one edge $e$ with
$c(e,F^{(i)}) \ge 1/2$. Thus in each step, we can select an edge $e_i$ such that the total weight of the paths in $F^{(i-1)}$
intersecting $e_i$ is at least 1/2, and hence the value of $F_i$ is at least 1/2 for every $1 \le i \le k$.
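The selection of the edges $e_i$ is again greedy and can be sketched directly. In the illustration below, the representation is an assumption of the example (a flow is a list of `(path, weight)` pairs, edges are a dictionary of vertex sets) and `locate_cliques` is a hypothetical name; the sketch picks, $k$ times, the edge carrying the largest remaining load and splits off the paths that intersect it.

```python
def locate_cliques(edges, flow, k):
    """Repeat k times: take the most loaded edge e_i, set K_i = e_i minus earlier edges,
    and move the remaining paths that intersect e_i into the subflow F_i."""
    remaining = list(flow)
    earlier = set()                      # union of e_1, ..., e_{i-1}
    cliques, subflows = [], []
    for _ in range(k):
        loads = {
            name: sum(w for path, w in remaining if vertices & set(path))
            for name, vertices in edges.items()
        }
        best = max(loads, key=loads.get)
        hit = [(path, w) for path, w in remaining if edges[best] & set(path)]
        cliques.append(edges[best] - earlier)
        subflows.append(hit)
        earlier |= edges[best]
        remaining = [(path, w) for path, w in remaining if not (edges[best] & set(path))]
    return cliques, subflows

edges = {"e1": {"a", "b"}, "e2": {"c", "d"}}
flow = [(("a", "b"), 0.75), (("c", "d"), 0.5)]
cliques, subflows = locate_cliques(edges, flow, k=2)
print(cliques, [sum(w for _, w in f) for f in subflows])
```

The sketch does not check the hypothesis $C_0 \ge 2k$ that guarantees every subflow has value at least 1/2; in this toy input both subflows happen to satisfy it.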

Moving the highly connected set. Let $U = \bigcup_{i=1}^{k} K_i$. Each path $P$ in $F_i$ is a path with endpoints in $W^*$ and
intersecting $K_i$. Let us truncate each path $P$ in $F_i$ such that its first endpoint is still in $W^*$ and its second endpoint is in
$K_i$; let $F'_i$ be the $(W^*,K_i)$-flow obtained by truncating every path in $F_i$. Note that $F'_i$ is still a flow and the sum $F'$ of $F'_1$,
$\dots$, $F'_k$ is a $(W^*, U)$-flow. Let $\mu_1 = \mu$ and let $\mu_2(v)$ be the total weight of the paths in $F'$ with second endpoint $v$. It is
clear that $\mu_2$ is a fractional independent set, $\mu_2(K_i) \ge 1/2$, and $F'$ is a $(\mu_1,\mu_2)$-demand $(W^*,U)$-flow with value $\mu_2(U)$.
Thus by Lemma 6.4, $U$ is a $(\mu_2, \lambda/6)$-connected set with the required properties. □

6.2 Concurrent flows and embedding 

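Let $W$ be a set of vertices and let $(X_1, \dots, X_k)$ be a partition of $W$. A uniform concurrent flow of value $\varepsilon$ on $(X_1, \dots, X_k)$
is a compatible set of $\binom{k}{2}$ flows $F_{i,j}$ ($1 \le i < j \le k$) where $F_{i,j}$ is an $(X_i,X_j)$-flow of value $\varepsilon$. The maximum value
of a uniform concurrent flow on $W$ can be expressed as the optimum values of the following primal and dual linear
programs (we denote by $\mathcal{P}_{i,j}$ the set of all $X_i$-$X_j$ paths):

Primal LP

\[
\begin{array}{lll}
\text{maximize} & \varepsilon & \\[1ex]
\text{s.t.} & \displaystyle \sum_{1 \le i < j \le k}\ \sum_{P \in \mathcal{P}_{i,j},\, P \cap e \neq \emptyset} x(P) \le 1 & \forall e \in E(H) \\[2ex]
 & \displaystyle \sum_{P \in \mathcal{P}_{i,j}} x(P) \ge \varepsilon & \forall 1 \le i < j \le k \\[2ex]
 & x(P) \ge 0 & \forall 1 \le i < j \le k,\ P \in \mathcal{P}_{i,j}
\end{array}
\]

Dual LP

\[
\begin{array}{lll}
\text{minimize} & \displaystyle \sum_{e \in E(H)} y(e) & \\[2ex]
\text{s.t.} & \displaystyle \sum_{e \in E(H),\, e \cap P \neq \emptyset} y(e) \ge \ell_{i,j} & \forall 1 \le i < j \le k,\ P \in \mathcal{P}_{i,j} \\[2ex]
 & \displaystyle \sum_{1 \le i < j \le k} \ell_{i,j} \ge 1 & \\[2ex]
 & y(e) \ge 0 & \forall e \in E(H) \\
 & \ell_{i,j} \ge 0 & \forall 1 \le i < j \le k
\end{array}
\]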



If $H$ is connected, then the maximum value of a uniform concurrent flow on $(X_1, \dots, X_k)$ is at least $1/\binom{k}{2} = \Omega(k^{-2})$:
if each of the $\binom{k}{2}$ flows has value $1/\binom{k}{2}$, then they are clearly compatible. The following lemma shows that in a $(\mu, \lambda)$-connected
set, if the sets $X_1, \dots, X_k$ are cliques and $\mu(X_i) \ge 1/2$ for every $i$, then we can guarantee a better bound of
$\Omega(k^{-3/2})$.

Lemma 6.6. Let $H$ be a hypergraph, $\mu$ a fractional independent set of $H$, and $W \subseteq V(H)$ a $(\mu, \lambda)$-connected set for
some $0 < \lambda < 1$. Let $(K_1, \dots, K_k)$ be a partition of $W$ such that $K_i$ is a clique and $\mu(K_i) \ge 1/2$ for every $1 \le i \le k$.
Then there is a uniform concurrent flow of value $\Omega(\lambda/k^{3/2})$ on $(K_1, \dots, K_k)$.



Proof. Suppose that there is no uniform concurrent flow of value $\beta \cdot \lambda/k^{3/2}$, where $\beta > 0$ is a sufficiently small universal
constant specified later. This means that the dual linear program has a solution having value less than that. Let us
fix such a solution $(y,\ell_{i,j})$ of the dual linear program. In the following, for every path $P$, we denote by $y(P) :=
\sum_{e \in E(H),\, e \cap P \neq \emptyset} y(e)$ the total weight of the edges intersecting $P$. It is clear from the dual linear program that $y(P) \ge \ell_{i,j}$
for every $P \in \mathcal{P}_{i,j}$.

We construct two graphs $G_1$ and $G_2$: the vertex set of both graphs is $\{1, \dots, k\}$ and for every $1 \le i < j \le k$, vertices
$i$ and $j$ are adjacent in $G_1$ (resp., $G_2$) if and only if $\ell_{i,j} > 1/(3k^2)$ (resp., $\ell_{i,j} > 1/k^2$). Note that $G_2$ is a subgraph of $G_1$.
First we prove the following claim:

Claim: If the distance of $u$ and $v$ is at most 3 in the complement of $G_1$, then $u$ and $v$ are not adjacent in $G_2$.

Suppose that $u w_1 w_2 v$ is a path of length 3 in the complement of $G_1$ (the same argument works for paths of length
less than 3). By definition of $G_1$, there is a $K_u$-$K_{w_1}$ path $P_1$, a $K_{w_1}$-$K_{w_2}$ path $P_2$, and a $K_{w_2}$-$K_v$ path $P_3$ such that
$y(P_1), y(P_2), y(P_3) \le 1/(3k^2)$. Since $K_{w_1}$ and $K_{w_2}$ are cliques, paths $P_1$ and $P_2$ touch, and paths $P_2$ and $P_3$ touch. Thus
by concatenating the three paths, we can obtain a $K_u$-$K_v$ path $P$ with $y(P) \le y(P_1) + y(P_2) + y(P_3) \le 1/k^2$, implying
that $u$ and $v$ are not adjacent in $G_2$, proving the claim. Note that the proof of this claim is the only point where we use
that the $K_i$'s are cliques.

Let $y' : E(H) \to \mathbb{R}^+$ be defined by $y'(e) := 3k^2 \cdot y(e)$, thus $y'$ has total weight less than $3\beta \cdot \lambda\sqrt{k}$. Suppose first
that $G_1$ has a matching of size $\lceil k/4 \rceil$. Without loss of generality, assume that $(i, i + \lceil k/4 \rceil)$ is an edge of $G_1$ for every
$1 \le i \le \lceil k/4 \rceil$. This means that $y'$ covers every $K_i$-$K_{i+\lceil k/4 \rceil}$ path for every $i$. Therefore, by Lemma 6.3, $y'$ has weight
at least $(\lambda/7) \cdot \sqrt{\lceil k/4 \rceil \cdot (1/2)} \ge 3\beta \cdot \lambda\sqrt{k}$, if $\beta$ is sufficiently small, yielding a contradiction.

Thus the size of the maximum matching in $G_1$ is less than $k/4$, which means that there is a vertex cover $S_1$ of size
less than $k/2$. Let $S_2 \subseteq S_1$ contain those vertices of $S_1$ that are adjacent to every vertex outside $S_1$ in $G_1$. We claim that
$S_2$ is a vertex cover of $G_2$. Suppose that there is an edge $uv$ of $G_2$ for some $u,v \notin S_2$. By the definition of $S_2$, either
$u \notin S_1$, or there is a vertex $w_1 \notin S_1$ such that $u$ and $w_1$ are not adjacent in $G_1$. Similarly, either $v$ is not in $S_1$, or it is not
adjacent in $G_1$ to some $w_2 \notin S_1$. Since vertices not in $S_1$ are not adjacent in $G_1$ (as $S_1$ is a vertex cover of $G_1$), we get
that the distance of $u$ and $v$ is at most 3 in the complement of $G_1$. Thus by the claim, $u$ and $v$ are not adjacent in $G_2$.
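The two threshold graphs and the sets $S_1$, $S_2$ are simple to compute once the values $\ell_{i,j}$ are known. The sketch below is illustrative only: the function names are hypothetical, the cover $S_1$ is taken to be the endpoints of a greedily built maximal matching (which has at most twice as many vertices as a maximum matching has edges), and the input values of $\ell$ are made up rather than coming from an actual dual solution.

```python
from itertools import combinations

def threshold_graph(ell, k, bound):
    """Pairs {i, j} with 1 <= i < j <= k whose value ell[{i, j}] exceeds the bound."""
    return {frozenset(p) for p in combinations(range(1, k + 1), 2)
            if ell.get(frozenset(p), 0.0) > bound}

def matching_vertex_cover(graph):
    """Endpoints of a greedily built maximal matching; they cover every edge."""
    cover = set()
    for edge in sorted(graph, key=sorted):
        if not (edge & cover):
            cover |= edge
    return cover

def fully_adjacent_part(cover, graph, k):
    """S_2: the vertices of the cover that are adjacent to every vertex outside the cover."""
    outside = set(range(1, k + 1)) - cover
    return {u for u in cover if all(frozenset({u, v}) in graph for v in outside)}

k = 4
ell = {frozenset({1, j}): 1 / 3 for j in (2, 3, 4)}   # a star centred at vertex 1
g1 = threshold_graph(ell, k, 1 / (3 * k ** 2))
g2 = threshold_graph(ell, k, 1 / k ** 2)
s1 = matching_vertex_cover(g1)
s2 = fully_adjacent_part(s1, g1, k)
print(s1, s2, all(edge & s2 for edge in g2))
```

For this star-shaped input, $S_1 = \{1, 2\}$ and $S_2 = \{1\}$, and $S_2$ indeed covers $G_2$. For arbitrary made-up values of $\ell$ the last check can fail, since the claim relies on the clique structure of the sets $K_i$.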

The total weight of $y$, which is less than $\beta \cdot \lambda/k^{3/2}$, is an upper bound on any $\ell_{i,j}$. Furthermore, if $i$ and $j$ are not
adjacent in $G_2$, then we have $\ell_{i,j} \le 1/k^2$. The number of edges in $G_2$ is at most $|S_2|k$ (as $S_2$ is a vertex cover), hence we
have

\[ 1 \le \sum_{1 \le i < j \le k} \ell_{i,j} \le |S_2|k \cdot \beta \cdot \lambda/k^{3/2} + \binom{k}{2}(1/k^2) \le \beta \cdot \lambda|S_2|/\sqrt{k} + 1/2, \]

which implies that $|S_2| \ge \sqrt{k}/(2\beta\lambda)$. Let $A := \bigcup_{i \in S_2} K_i$ and $B := \bigcup_{i \notin S_1} K_i$; we have $\mu(A) \ge |S_2| \cdot (1/2) \ge \sqrt{k}/(4\beta\lambda)$
and $\mu(B) \ge (1/2) \cdot (k - |S_1|) \ge k/4$. As every vertex of $S_2$ is adjacent in $G_1$ with every vertex outside $S_1$, assignment
$y'$ covers every $A$-$B$ path. However, $y'$ has weight less than $3\beta \cdot \lambda\sqrt{k} \le \min\{\sqrt{k}/(4\beta\lambda), k/4\}$ (using that $\lambda \le 1$ and
assuming that $\beta$ is sufficiently small), contradicting the assumption that $W$ is $(\mu, \lambda)$-connected. □

Intuitively, the intersection structure of the paths appearing in a uniform concurrent flow on cliques $K_1, \dots, K_k$ is
reminiscent of the edges of the complete graph on $k$ vertices: if $\{i_1,j_1\} \cap \{i_2,j_2\} \neq \emptyset$, then every path of $F_{i_1,j_1}$ touches
every path of $F_{i_2,j_2}$. We use the following result from [42], which shows that the line graphs of cliques have good
embedding properties. If $G$ is a graph and $q \ge 1$ is an integer, then the blow up $G^{(q)}$ is obtained from $G$ by replacing
every vertex $v$ with a clique $K_v$ of size $q$ and for every edge $uv$ of $G$, connecting every vertex of the clique $K_u$ with
every vertex of the clique $K_v$. Let $L_k$ be the line graph of the complete graph on $k$ vertices.
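Both constructions are easy to build explicitly for small parameters. The sketch below uses hypothetical names and represents a vertex of $L_k$ as the 2-element set $\{i,j\}$ and a vertex of the blow-up as a pair (original vertex, copy index); it constructs $L_k$ and $L_k^{(q)}$ and counts their vertices and edges.

```python
from itertools import combinations

def line_graph_of_complete_graph(k):
    """L_k: one vertex per pair {i, j}; two pairs are adjacent when they share an element."""
    vertices = [frozenset(p) for p in combinations(range(1, k + 1), 2)]
    edges = {frozenset({a, b}) for a, b in combinations(vertices, 2) if a & b}
    return vertices, edges

def blow_up(vertices, edges, q):
    """G^(q): replace every vertex by a clique of size q and every edge by a complete
    bipartite connection between the two cliques."""
    new_vertices = [(v, copy) for v in vertices for copy in range(q)]
    new_edges = set()
    for a, b in combinations(new_vertices, 2):
        if a[0] == b[0] or frozenset({a[0], b[0]}) in edges:
            new_edges.add(frozenset({a, b}))
    return new_vertices, new_edges

lk_vertices, lk_edges = line_graph_of_complete_graph(4)
bu_vertices, bu_edges = blow_up(lk_vertices, lk_edges, 2)
print(len(lk_vertices), len(lk_edges), len(bu_vertices), len(bu_edges))
```

For $k = 4$ the line graph has 6 vertices and 12 edges, and its blow-up with $q = 2$ has 12 vertices and $6 \cdot 1 + 12 \cdot 4 = 54$ edges.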

Lemma 6.7 ([42]). For every $k > 1$ there is a constant $n_k > 0$ such that for every $G(V,E)$ with $|E| > n_k$ and no isolated
vertices, the graph $G$ is a minor of $L_k^{(q)}$ for $q = \lceil 130|E|/k^2 \rceil$. Furthermore, a minor mapping can be found in time
polynomial in the size of $G$.

Using the terminology of embeddings, a minor mapping of $G$ into $L_k^{(q)}$ can be considered as an embedding from $G$
to $L_k$ where every vertex of $L_k$ appears in the image of at most $q$ vertices, i.e., the vertex depth of the embedding is at
most $q$. Thus we can restate Lemma 6.7 the following way:

Lemma 6.8. For every $k > 1$ there is a constant $n_k > 0$ such that for every $G(V,E)$ with $|E| > n_k$ and no isolated
vertices, the graph $G$ has an embedding into $L_k$ with vertex depth $O(|E|/k^2)$. Furthermore, such an embedding can be
found in time polynomial in the size of $G$.

Now we are ready to prove Theorem 6.1, the main result of the section:

Proof (of Theorem 6.1). By Lemma 6.5 and Lemma 6.6, for some $k = \Omega(\lambda\sqrt{\mathrm{con}_\lambda(H)})$, there are cliques $K_1, \dots, K_k$
and a uniform concurrent flow $F_{i,j}$ ($1 \le i < j \le k$) of value $\varepsilon = \Omega(\lambda/k^{3/2})$ on $(K_1, \dots, K_k)$. By trying all possibilities for
the cliques and then solving the uniform concurrent flow linear program, we can find these flows (the time required for
this step is a constant $f(H,\lambda)$ depending only on $H$ and $\lambda$). Let $w_0$ be the smallest positive weight appearing in the
flows.

Let $m = |E(G)|$ and suppose that $m > n_k$, for the constant $n_k$ in Lemma 6.8. Thus the algorithm of Lemma 6.8 can be
used to find an embedding $\psi$ from $G$ to $L_k$ with vertex depth $q = O(m/k^2)$. Let us denote by $v_{\{i,j\}}$ ($1 \le i < j \le k$) the
vertices of $L_k$ with the meaning that distinct vertices $v_{\{i_1,j_1\}}$ and $v_{\{i_2,j_2\}}$ are adjacent if and only if $\{i_1,j_1\} \cap \{i_2,j_2\} \neq \emptyset$.

We construct an embedding $\phi$ from $G$ to $H$ the following way. The set $\phi(u)$ is obtained by replacing each vertex
$v_{\{i,j\}} \in \psi(u)$ by a path from the flow $F_{i,j}$ (thus $\phi(u)$ is a union of paths). We select the paths in such a
way that the following requirement is satisfied: a path $P$ of $F_{i,j}$ having weight $w$ is selected into the images of at most
$\lceil (q/\epsilon) \cdot w \rceil$ vertices of $G$. We set $m_{H,\lambda}$ sufficiently large that $(q/\epsilon) \cdot w_0 \ge 1$ (note that $q$ depends on $m$, but $\epsilon$ and $w_0$
depend only on $H$ and $\lambda$). Thus if $m \ge m_{H,\lambda}$, then $\lceil (q/\epsilon) \cdot w \rceil \le 2(q/\epsilon) \cdot w$. Since the total weight of the paths in $F_{i,j}$
is $\epsilon$, these paths can accommodate the image of at least $(q/\epsilon) \cdot \epsilon = q$ vertices. As each vertex $v_{\{i,j\}}$ of $L_k$ appears in the
image of at most $q$ vertices of $G$ in the mapping $\psi$, we can satisfy the requirement.

It is easy to see that if $u_1$ and $u_2$ are adjacent in $G$, then $\phi(u_1)$ and $\phi(u_2)$ touch: in this case, there are vertices
$v_{\{i_1,j_1\}} \in \psi(u_1)$, $v_{\{i_2,j_2\}} \in \psi(u_2)$ that are adjacent or the same in $L_k$ (that is, there is a $t \in \{i_1,j_1\} \cap \{i_2,j_2\}$), and the
corresponding paths of $F_{i_1,j_1}$ and $F_{i_2,j_2}$ selected into $\phi(u_1)$ and $\phi(u_2)$ touch, as they both intersect the clique $K_t$. With a
similar argument, we can show that $\phi(u)$ is connected.

To bound the edge depth of the embedding $\phi$, consider an edge $e$. The total weight of the paths intersecting $e$
is at most 1 and a path with weight $w$ is used in the image of at most $2(q/\epsilon) \cdot w$ vertices. Each path intersects $e$
in at most 2 vertices (as we can assume that the paths appearing in the flows are minimal), thus a path with weight
$w$ contributes at most $4(q/\epsilon) \cdot w$ to the depth of $e$. Thus the edge depth of $\phi$ is at most $4(q/\epsilon) = O(m/(\lambda\sqrt{k})) =
O(m/(\lambda^{3/2}\,\mathrm{con}_\lambda(H)^{1/4}))$. □
6.3 Connection with adaptive width

As an easy consequence of the embedding result Corollary 6.2, we can show that large submodular width implies large
adaptive width:

Lemma 6.9. For every hypergraph $H$, $\mathrm{adw}(H) = \Omega(\mathrm{emb}(H))$.

Proof. Suppose that $\mathrm{emb}(H) > \alpha$. This means that there is an integer $m_\alpha$ such that every graph with $m \ge m_\alpha$ edges
has an embedding into $H$ with edge depth $m/\alpha$. It is well-known that there are arbitrarily large sparse graphs whose
treewidth is linear in the number of vertices (for example, bounded-degree expanders, see e.g., [29]): for some universal
constant $\beta$, there is a graph $G$ with $m \ge m_\alpha$ edges and treewidth at least $\beta m$. Thus there is an embedding $\phi$ from $G$ to $H$
with edge depth at most $q \le m/\alpha$. Let $d(v)$ be the depth of vertex $v$ in the embedding and let us define $\mu(v) := d(v)/q$.
From the definition of edge depth, it is clear that $\mu$ is a fractional independent set. Suppose that there is a tree
decomposition $(T, B_{t \in V(T)})$ of $H$ having $\mu$-width $w$. This tree decomposition can be turned into a tree decomposition
$(T, B'_{t \in V(T)})$ of $G$: for every $B_t \subseteq V(H)$, let $B'_t := \{u \in V(G) \mid \phi(u) \cap B_t \neq \emptyset\}$ contain those vertices of $G$ whose images
intersect $B_t$. Now $\mu(B_t) \le w$ means that $\sum_{v \in B_t} d(v) \le qw$, which implies that $|B'_t| \le qw$. Thus the width of $(T, B'_{t \in V(T)})$
is less than $qw$, which means that $w$ has to be at least $\beta m/q = \Omega(\alpha)$, the required lower bound on the adaptive width
of $H$. □
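
The bag transformation used in this proof can be made concrete with a small sketch. It is an illustration under assumptions, not part of the original proof: the embedding is stored as a dictionary `phi` from vertices of $G$ to frozensets of vertices of $H$, the toy hypergraph, embedding and bags are hypothetical, and the helper names are chosen for this example.

```python
from fractions import Fraction


def pull_back_bags(phi, bags):
    """B'_t = {u in V(G) : phi(u) intersects B_t}, for every node t of the tree."""
    return {t: {u for u, image in phi.items() if image & bag} for t, bag in bags.items()}


def fractional_weights(phi, hyperedges):
    """Return the edge depth q of phi and the weights mu(v) = d(v) / q,
    where d(v) is the number of vertices of G whose image contains v."""
    q = max(sum(len(image & e) for image in phi.values()) for e in hyperedges)
    vertices = set().union(*hyperedges)
    d = {v: sum(1 for image in phi.values() if v in image) for v in vertices}
    return q, {v: Fraction(d[v], q) for v in vertices}


# Hypothetical toy data: H has edges {a,b,c} and {c,d}; G is the path u1 - u2 - u3.
hyperedges = [frozenset("abc"), frozenset("cd")]
phi = {"u1": frozenset("a"), "u2": frozenset("bc"), "u3": frozenset("d")}
bags = {"t1": frozenset("abc"), "t2": frozenset("cd")}

q, mu = fractional_weights(phi, hyperedges)
new_bags = pull_back_bags(phi, bags)
```

On this toy input the weights $\mu$ sum to at most 1 on every edge of $H$, and each pulled-back bag $B'_t$ has at most $q \cdot \mu(B_t)$ elements, as the proof requires.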

Combining Theorem 5.1 and Lemma 6.9 gives:

Corollary 6.10. For every hypergraph $H$, $\mathrm{subw}(H) = O(\mathrm{adw}(H)^4)$.



7 From embeddings to hardness of CSP 

We prove the main hardness result of the paper in this section: 

Theorem 7.1. Let $\mathcal{H}$ be a recursively enumerable class of hypergraphs with unbounded submodular width. If there
is an algorithm $A$ and a function $f$ such that $A$ solves every instance $I$ of CSP($\mathcal{H}$) with hypergraph $H \in \mathcal{H}$ in time
$f(H) \cdot \|I\|^{o(\mathrm{subw}(H)^{1/4})}$, then the Exponential Time Hypothesis fails.

In particular, Theorem 7.1 implies that CSP($\mathcal{H}$) for such an $\mathcal{H}$ is not fixed-parameter tractable:

Corollary 7.2. If $\mathcal{H}$ is a recursively enumerable class of hypergraphs with unbounded submodular width, then CSP($\mathcal{H}$)
is not fixed-parameter tractable, unless the Exponential Time Hypothesis fails.

The Exponential Time Hypothesis (ETH) states that there is no $2^{o(n)}$ time algorithm for $n$-variable 3SAT. The
Sparsification Lemma of Impagliazzo, Paturi, and Zane [35] shows that ETH is equivalent to the assumption that there
is no algorithm for 3SAT whose running time is subexponential in the number of clauses. This result will be crucial
for our hardness proof, as our reduction from 3SAT is sensitive to the number of clauses.

Theorem 7.3 (Impagliazzo, Paturi, and Zane [35]). If there is a $2^{o(m)}$ time algorithm for $m$-clause 3SAT, then there is
a $2^{o(n)}$ time algorithm for $n$-variable 3SAT.

To prove Theorem 7.1, we show that a subexponential-time algorithm for 3SAT exists if CSP($\mathcal{H}$) can be solved
"too fast" for some $\mathcal{H}$ with unbounded submodular width. We use the characterization of submodular width from
Section 5 and the embedding results of Section 6 to reduce 3SAT to CSP($\mathcal{H}$) by embedding the incidence graph of a
3SAT formula into a hypergraph $H \in \mathcal{H}$. The basic idea of the proof is that if the 3SAT formula has $m$ clauses and the
edge depth of the embedding is $m/r$, then we can gain a factor $r$ in the exponent of the running time. If submodular
width is unbounded in $\mathcal{H}$, then we can make this gap $r$ between the number of clauses and the edge depth arbitrarily
large, and hence the exponent can be arbitrarily smaller than the number of clauses, i.e., the algorithm is subexponential
in the number of clauses.

The following simple lemma from [42] gives a transformation that turns a 3SAT instance into a binary CSP instance.
We include the proof for completeness.






Lemma 7.4. Given an instance of 3SAT with $n$ variables and $m$ clauses, it is possible to construct in polynomial time
an equivalent CSP instance with $n + m$ variables, $3m$ binary constraints, and domain size 3.

Proof. Let $\varphi$ be a 3SAT formula with $n$ variables and $m$ clauses. We construct an instance of CSP as follows. The
CSP instance contains a variable $x_i$ ($1 \le i \le n$) corresponding to the $i$-th variable of $\varphi$ and a variable $y_j$ ($1 \le j \le m$)
corresponding to the $j$-th clause of $\varphi$. Let $D = \{1,2,3\}$ be the domain. We try to describe a satisfying assignment of $\varphi$
with these $n + m$ variables. The intended meaning of the variables is the following. If the value of variable $x_i$ is 1 (resp.,
2), then this represents that the $i$-th variable of $\varphi$ is true (resp., false). If the value of variable $y_j$ is $\ell$, then this represents
that the $j$-th clause of $\varphi$ is satisfied by its $\ell$-th literal. To ensure consistency, we add $3m$ constraints. Let $1 \le j \le m$ and
$1 \le \ell \le 3$, and assume that the $\ell$-th literal of the $j$-th clause is a positive occurrence of the $i$-th variable. In this case, we
add the binary constraint $(x_i = 1 \vee y_j \neq \ell)$: either $x_i$ is true or some other literal satisfies the clause. Similarly, if the $\ell$-th
literal of the $j$-th clause is a negated occurrence of the $i$-th variable, then we add the binary constraint $(x_i = 2 \vee y_j \neq \ell)$.
It is easy to verify that if $\varphi$ is satisfiable, then we can assign values to the variables of the CSP instance such that every
constraint is satisfied, and conversely, if the CSP instance has a solution, then $\varphi$ is satisfiable. □
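
The transformation of Lemma 7.4 is short enough to write out. The following is a minimal sketch under assumptions made for illustration: clauses are given as triples of nonzero integers (a positive integer $i$ for the $i$-th variable, $-i$ for its negation), a constraint is a pair of a scope and a set of allowed value pairs, and the brute-force checker `csp_satisfiable` is only meant for tiny instances.

```python
from itertools import product


def sat_to_binary_csp(n, clauses):
    """Build the binary CSP of Lemma 7.4 from a 3SAT formula with n variables."""
    domain = (1, 2, 3)
    variables = [("x", i) for i in range(1, n + 1)]
    variables += [("y", j) for j in range(1, len(clauses) + 1)]
    constraints = []
    for j, clause in enumerate(clauses, start=1):
        for ell, literal in enumerate(clause, start=1):
            wanted = 1 if literal > 0 else 2  # 1 encodes true, 2 encodes false
            # allowed pairs (value of x_i, value of y_j): x_i = wanted OR y_j != ell
            relation = {(a, b) for a in domain for b in domain if a == wanted or b != ell}
            constraints.append(((("x", abs(literal)), ("y", j)), relation))
    return variables, domain, constraints


def csp_satisfiable(variables, domain, constraints):
    """Exhaustive search; exponential, intended only for toy examples."""
    for values in product(domain, repeat=len(variables)):
        f = dict(zip(variables, values))
        if all((f[u], f[v]) in relation for (u, v), relation in constraints):
            return True
    return False
```

The resulting instance has $n + m$ variables and $3m$ constraints, and its primal graph is the incidence graph of the formula.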

Next we show that an embedding from graph $G$ to hypergraph $H$ can be used to simulate a binary CSP instance $I_1$
having primal graph $G$ by a CSP instance $I_2$ whose hypergraph is $H$. The domain size and the size of the constraint
relations of $I_2$ can grow very large in this transformation: the edge depth of the embedding determines how large this
increase is.

Lemma 7.5. Let $I_1 = (V_1, D_1, C_1)$ be a binary CSP instance with primal graph $G$ and let $\phi$ be an embedding of $G$ into
a hypergraph $H$ with edge depth $q$. Given $I_1$, $H$, and the embedding $\phi$, it is possible to construct (in time polynomial
in the size of the output) an equivalent CSP instance $I_2 = (V_2, D_2, C_2)$ with hypergraph $H$ where the size of every
constraint relation is at most $|D_1|^q$.

Proof. For every $v \in V(H)$, let $U_v := \{u \in V(G) \mid v \in \phi(u)\}$ be the set of vertices in $G$ whose images contain $v$, and
for every $e \in E(H)$, let $U_e := \bigcup_{v \in e} U_v$. Observe that for every $e \in E(H)$, we have $|U_e| \le \sum_{v \in e} |U_v| \le q$, since the edge
depth of $\phi$ is $q$. Let $D_2$ be the set of integers between 1 and $|D_1|^q$. For every $v \in V(H)$, the number of assignments
from $U_v$ to $D_1$ is clearly $|D_1|^{|U_v|} \le |D_1|^q$. Let us fix a bijection $h_v$ between these assignments on $U_v$ and the set
$\{1, \dots, |D_1|^{|U_v|}\} \subseteq D_2$.

The set $C_2$ of constraints of $I_2$ is constructed as follows. For each $e \in E(H)$, there is a constraint $(s_e, R_e)$ in
$C_2$, where $s_e$ is an $|e|$-tuple containing an arbitrary ordering of the elements of $e$. The relation $R_e$ is defined the
following way. Suppose that $v_i$ is the $i$-th coordinate of $s_e$ and consider a tuple $t = (d_1, \dots, d_{|e|}) \in D_2^{|e|}$ of integers where
$1 \le d_i \le |D_1|^{|U_{v_i}|}$ for every $1 \le i \le |e|$. This means that $d_i$ is in the image of $h_{v_i}$ and hence $f_i := h_{v_i}^{-1}(d_i)$ is an assignment
from $U_{v_i}$ to $D_1$. We define relation $R_e$ such that it contains tuple $t$ if the following two conditions hold. First, we require
that the assignments $f_1, \dots, f_{|e|}$ are consistent in the sense that $f_i(u) = f_j(u)$ for any $i, j$ and $u \in U_{v_i} \cap U_{v_j}$. In this case,
$f_1, \dots, f_{|e|}$ together define an assignment $f$ on $\bigcup_{i=1}^{|e|} U_{v_i} = U_e$. The second requirement is that this assignment $f$ satisfies
every constraint of $I_1$ whose scope is contained in $U_e$, that is, for every constraint $((u_1,u_2),R) \in C_1$ with $\{u_1,u_2\} \subseteq U_e$,
we have $(f(u_1), f(u_2)) \in R$. This completes the description of the instance $I_2$.

Let us bound the maximum size of a relation of $I_2$. Consider the relation $R_e$ constructed in the previous paragraph.
It contains tuples $(d_1, \dots, d_{|e|}) \in D_2^{|e|}$ where $1 \le d_i \le |D_1|^{|U_{v_i}|}$ for every $1 \le i \le |e|$. This means that

$$ |R_e| \le \prod_{i=1}^{|e|} |D_1|^{|U_{v_i}|} = |D_1|^{\sum_{i=1}^{|e|} |U_{v_i}|} \le |D_1|^q, \tag{2} $$

where the last inequality follows from the fact that $\phi$ has edge depth at most $q$.

To prove that $I_1$ and $I_2$ are equivalent, assume first that $I_1$ has a solution $f_1 : V_1 \to D_1$. For every $v \in V_2$, let us define
$f_2(v) := h_v(\mathrm{pr}_{U_v} f_1)$, that is, the integer between 1 and $|D_1|^{|U_v|}$ corresponding to the projection of assignment $f_1$ to $U_v$.
It is easy to see that $f_2$ is a solution of $I_2$.

Assume now that $I_2$ has a solution $f_2 : V_2 \to D_2$. For every $v \in V(H)$, let $f_v := h_v^{-1}(f_2(v))$ be the assignment
from $U_v$ to $D_1$ that corresponds to $f_2(v)$ (note that by construction, $f_2(v)$ is at most $|D_1|^{|U_v|}$, hence $h_v^{-1}(f_2(v))$ is
well-defined). We claim that these assignments are compatible: if $u \in U_{v'} \cap U_{v''}$ for some $u \in V(G)$ and $v', v'' \in V(H)$, then
$f_{v'}(u) = f_{v''}(u)$. Recall that $\phi(u)$ is a connected set in $H$, hence there is a path between $v'$ and $v''$ in $\phi(u)$. We prove the
claim by induction on the distance between $v'$ and $v''$ in $\phi(u)$. If the distance is 0, that is, $v' = v''$, then the statement is
trivial. Suppose now that the distance of $v'$ and $v''$ is $d > 0$. This means that $v'$ has a neighbor $z \in \phi(u)$ such that the
distance of $z$ and $v''$ is $d - 1$. Therefore, $f_z(u) = f_{v''}(u)$ by the induction hypothesis. Since $v'$ and $z$ are adjacent in $H$,
there is an edge $e \in E(H)$ containing both $v'$ and $z$. From the way $I_2$ is defined, this means that $f_{v'}$ and $f_z$ are compatible
and $f_{v'}(u) = f_z(u) = f_{v''}(u)$ follows, proving the claim. Thus the assignments $f_v$, $v \in V(H)$, are compatible and these
assignments together define an assignment $f_1 : V(G) \to D_1$. We claim that $f_1$ is a solution of $I_1$. Let $c = ((u_1,u_2),R)$
be an arbitrary constraint of $I_1$. Since $u_1u_2 \in E(G)$, sets $\phi(u_1)$ and $\phi(u_2)$ touch, thus there is an edge $e \in E(H)$ that
contains a vertex $v_1 \in \phi(u_1)$ and a vertex $v_2 \in \phi(u_2)$ (or, in other words, $u_1 \in U_{v_1}$ and $u_2 \in U_{v_2}$). The definition of
$c_e$ in $I_2$ ensures that $f_1$ restricted to $U_{v_1} \cup U_{v_2}$ satisfies every constraint of $I_1$ whose scope is contained in $U_{v_1} \cup U_{v_2}$; in
particular, $f_1$ satisfies constraint $c$. □
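
The construction in the proof of Lemma 7.5 can be tried out on toy inputs with the sketch below. It is an illustration under stated assumptions rather than the construction verbatim: instead of integers obtained through the bijections $h_v$, the value of a vertex $v$ of $H$ is stored directly as the assignment on $U_v$ (a tuple of pairs), and $R_e$ is generated by enumerating assignments on $U_e$, which yields exactly the consistent tuples described above. The function names and the toy instances are hypothetical.

```python
from itertools import product


def simulate_on_hypergraph(domain, constraints, phi, hyperedges):
    """Return {scope: R_e} for every hyperedge e, following the proof of Lemma 7.5."""
    g_vertices = sorted(phi)

    def U(v):
        # vertices of G whose image contains the hypergraph vertex v
        return [u for u in g_vertices if v in phi[u]]

    relations = {}
    for e in hyperedges:
        scope = tuple(sorted(e))
        U_e = sorted({u for v in scope for u in U(v)})
        relation = set()
        for values in product(domain, repeat=len(U_e)):
            f = dict(zip(U_e, values))
            # keep f only if it satisfies every constraint whose scope lies inside U_e
            if all((f[a], f[b]) in R for (a, b), R in constraints if a in f and b in f):
                relation.add(tuple(tuple((u, f[u]) for u in U(v)) for v in scope))
        relations[scope] = relation
    return relations


def hypergraph_csp_satisfiable(relations):
    """Brute-force solver for the simulated instance (toy sizes only)."""
    vertices = sorted({v for scope in relations for v in scope})
    candidates = {v: set() for v in vertices}
    for scope, relation in relations.items():
        for t in relation:
            for v, value in zip(scope, t):
                candidates[v].add(value)
    for choice in product(*(sorted(candidates[v]) for v in vertices)):
        f2 = dict(zip(vertices, choice))
        if all(tuple(f2[v] for v in scope) in rel for scope, rel in relations.items()):
            return True
    return False


# Hypothetical toy data: 2-colouring of a path (solvable) and of a triangle (not solvable).
neq = {(0, 1), (1, 0)}
hyperedges = [frozenset("abc"), frozenset("cd")]
path_constraints = [(("u1", "u2"), neq), (("u2", "u3"), neq)]
path_phi = {"u1": frozenset("a"), "u2": frozenset("bc"), "u3": frozenset("d")}
triangle_constraints = path_constraints + [(("u1", "u3"), neq)]
triangle_phi = {"u1": frozenset("a"), "u2": frozenset("b"), "u3": frozenset("c")}
```

Enumerating assignments on $U_e$ rather than tuples of $D_2^{|e|}$ also makes the size bound visible: each relation has at most $|D_1|^{|U_e|}$ tuples.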

Now we are ready to prove Theorem 7.1, the main result of the section. We show that if there is a class $\mathcal{H}$ of
hypergraphs with unbounded submodular width such that CSP($\mathcal{H}$) is FPT, then this algorithm can be used to solve
3SAT in subexponential time. The main ingredients are the embedding result of Theorem 6.1 and Lemmas 7.4 and
7.5 above on reduction to CSP. Furthermore, we need a way of choosing an appropriate hypergraph from the set $\mathcal{H}$. As
discussed earlier, the larger the submodular width of the hypergraph is, the more we gain in the running time. However,
we should not spend too much time on constructing the hypergraph and on finding an embedding. Therefore, we use
the same technique as in [42]: we enumerate a certain number of hypergraphs and we try all of them simultaneously.
The number of hypergraphs enumerated depends on the size of the 3SAT instance. This will be done in such a way that
guarantees that we do not spend too much time on the enumeration, but eventually every hypergraph in $\mathcal{H}$ is considered
for sufficiently large input sizes.

Proof (of Theorem 7.1). Let us fix a $\lambda > 0$ that is sufficiently small for Theorems 5.1 and 6.1. Suppose that there is an
$f_1(H)\,n^{o(\mathrm{subw}(H)^{1/4})}$ time algorithm $A$ for CSP($\mathcal{H}$). We can express the running time as $f_1(H)\,n^{\mathrm{subw}(H)^{1/4}/\iota(\mathrm{subw}(H))}$ for
some unbounded nondecreasing function $\iota$ with $\iota(1) > 1$. We construct an algorithm $B$ that solves 3SAT in
subexponential time by using algorithm $A$ as subroutine.

Given an instance $I$ of 3SAT with $n$ variables and $m$ clauses and a hypergraph $H \in \mathcal{H}$, we can solve $I$ the following
way. First we use Lemma 7.4 to transform $I$ into a CSP instance $I_1 = (V_1, D_1, C_1)$ with $|V_1| = n + m$, $|D_1| = 3$, and
$|C_1| = 3m$. Let $G$ be the primal graph of $I_1$, which is a graph having $3m$ edges. It can be assumed that $m$ is greater than
some constant $m_{H,\lambda}$ of Theorem 6.1, otherwise the instance can be solved in constant time. Therefore, the algorithm
of Theorem 6.1 can be used to find an embedding $\phi$ of $G$ into $H$ with edge depth $q = O(m/(\lambda^{3/2}\,\mathrm{con}_\lambda(H)^{1/4}))$; by
Theorem 5.1, we have $q \le c_\lambda m/\mathrm{subw}(H)^{1/4}$ for some constant $c_\lambda$ depending only on $\lambda$. By Lemma 7.5, we can
construct an equivalent instance $I_2 = (V_2, D_2, C_2)$ whose hypergraph is $H$. By solving $I_2$ using the assumed algorithm
$A$ for CSP($\mathcal{H}$), we can answer if $I_1$ has a solution, or equivalently, if the 3SAT instance $I$ has a solution.

We will call "running algorithm $A[I,H]$" this way of solving the 3SAT instance $I$. Let us determine the running
time of $A[I,H]$. The two dominating terms are the time required to find the embedding $\phi$ using the $f(H,\lambda)m^{O(1)}$ time
algorithm of Theorem 6.1 and the time required to run $A$ on $I_2$. The size of every constraint relation in $I_2$ is at most
$|D_1|^q = 3^q$, hence $\|I_2\| = O((|E(H)| + |V(H)|)3^q)$. Let $k = \mathrm{subw}(H)$. Using $q \le c_\lambda m/k^{1/4}$, the total running time of
$A[I,H]$ can be bounded by

$$ f(H,\lambda)\,m^{O(1)} + f_1(H)\,\|I_2\|^{k^{1/4}/\iota(k)} = f_2(H,\lambda) \cdot m^{O(1)} \cdot 3^{c_\lambda m/\iota(k)} $$

for an appropriate function $f_2(H,\lambda)$ depending only on $H$ and $\lambda$.

Algorithm $B$ for 3SAT proceeds as follows. Let us fix an arbitrary computable enumeration $H_1, H_2, \dots$ of the
hypergraphs in $\mathcal{H}$. Given an $m$-clause 3SAT formula $I$, algorithm $B$ spends the first $m$ steps on enumerating these
hypergraphs; let $H_\ell$ be the last hypergraph produced by this enumeration (we assume that $m$ is sufficiently large that
$\ell \ge 1$). Next we start simulating the algorithms $A[I,H_1]$, $A[I,H_2]$, $\dots$, $A[I,H_\ell]$ in parallel. When one of the simulations
stops and returns an answer, then we stop all the simulations and return the answer. It is clear that algorithm $B$ will
correctly decide the satisfiability of $I$.
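
The parallel simulation performed by algorithm $B$ is ordinary round-robin interleaving. The sketch below illustrates it with Python generators; it is an assumption-laden toy in which each simulation $A[I,H_i]$ is modelled by the hypothetical generator `countdown`, which merely counts steps before returning a fixed answer.

```python
def countdown(steps, answer):
    """Toy stand-in for one simulation A[I, H_i]: works for `steps` steps, then answers."""
    for _ in range(steps):
        yield
    return answer


def run_in_parallel(simulations):
    """Advance every simulation by one step in turn and stop as soon as one halts.
    Returns the answer of the first simulation to finish and the total step count."""
    active = list(simulations)
    if not active:
        raise ValueError("at least one simulation is required")
    total_steps = 0
    while True:
        for sim in active:
            total_steps += 1
            try:
                next(sim)
            except StopIteration as finished:
                return finished.value, total_steps


answer, total_steps = run_in_parallel(
    [countdown(50, "slow"), countdown(3, "sat"), countdown(1000, "slower")]
)
```

If the fastest of $\ell$ simulations halts after $t$ steps, the interleaving performs at most $\ell \cdot (t+1)$ steps in total, which is the reason the overhead of running $\ell \le m$ simulations is only polynomial.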
We claim that there is a universal constant $d$ such that for every $s$, there is an $m_s$ such that for every $m > m_s$, the
running time of $B$ is at most $(m \cdot 2^{m/s})^d$ on an $m$-clause formula. Clearly, this means that the running time of $B$ is $2^{o(m)}$.

Let $k_s$ be the smallest positive integer such that $\iota(k_s) \ge s$ (as $\iota$ is unbounded, this is well defined). Let $i_s$ be the
smallest positive integer such that $\mathrm{subw}(H_{i_s}) \ge k_s$ (as $\mathcal{H}$ has unbounded submodular width, this is also well defined).
Set $m_s$ sufficiently large that $m_s \ge f_2(H_{i_s}, \lambda)$ and the fixed enumeration of $\mathcal{H}$ reaches $H_{i_s}$ in less than $m_s$ steps. This
means that if we run $B$ on a 3SAT formula $I$ with $m \ge m_s$ clauses, then $A[I,H_{i_s}]$ will be one of the $\ell$ simulations started
by $B$. The simulation of $A[I,H_{i_s}]$ terminates in

$$ f_2(H_{i_s},\lambda)\,m^{O(1)} \cdot 3^{c_\lambda m/\iota(\mathrm{subw}(H_{i_s}))} \le m \cdot m^{O(1)} \cdot 3^{c_\lambda m/s} $$

steps. Taking into account that we simulate $\ell \le m$ algorithms in parallel and all the simulations are stopped not later
than the termination of $A[I,H_{i_s}]$, the running time of $B$ can be bounded polynomially by the running time of $A[I,H_{i_s}]$.
Therefore, there is a constant $d$ such that the running time of $B$ is at most $(m \cdot 2^{m/s})^d$, as required. □

Remark 7.6. Recall that if $\phi$ is an embedding of $G$ into $H$, then the depth of an edge $e \in E(H)$ is $d_\phi(e) = \sum_{v \in V(G)} |\phi(v) \cap
e|$. A variant of this definition would be to define the depth of $e$ as $d'_\phi(e) = |\{v \in V(G) \mid \phi(v) \cap e \neq \emptyset\}|$, i.e., if $\phi(v)$
intersects $e$, then $v$ contributes only 1 to the depth of $e$, not $|\phi(v) \cap e|$ as in the original definition. Let us call this variant
weak edge depth; it is clear that the weak edge depth of an embedding is at most the edge depth of the embedding.

Lemma 7.5 can be made stronger by requiring only that the weak edge depth is at most $q$. Indeed, the only place
where we use the bound on edge depth is in Inequality (2). However, the size of the relation $R_e$ can be bounded by
the number of possible assignments on $U_e$ in instance $I_1$. If weak edge depth is at most $q$, then $|U_e| \le q$, and the $|D_1|^q$
bound on the size of $R_e$ follows.
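
The difference between the two notions of depth is easy to see on a small example. The sketch below is illustrative only; the dictionary representation of the embedding and the toy data are assumptions.

```python
def edge_depth(phi, e):
    """d(e): every vertex v of G contributes |phi(v) & e|."""
    return sum(len(image & e) for image in phi.values())


def weak_edge_depth(phi, e):
    """d'(e): every vertex v of G whose image meets e contributes exactly 1."""
    return sum(1 for image in phi.values() if image & e)


# Hypothetical toy embedding: the image of u2 uses two vertices of the edge {a, b, c}.
phi = {"u1": frozenset("a"), "u2": frozenset("bc"), "u3": frozenset("d")}
edge_abc = frozenset("abc")
edge_cd = frozenset("cd")
```

Here the edge $\{a,b,c\}$ has depth 3 but weak depth 2, because the image of `u2` is counted twice by the original definition and once by the variant, while the edge $\{c,d\}$ has depth 2 under both definitions.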

Remark 7.7. A different version of CSP was investigated in [44], where each variable has a different domain, and each
constraint relation is represented by a full truth table (see the exact definition in [44]). Let us denote by CSP$_{\mathrm{tt}}$($\mathcal{H}$) this
variant of the problem. It is easy to see that CSP$_{\mathrm{tt}}$($\mathcal{H}$) can be reduced to CSP($\mathcal{H}$) in polynomial time, but a reduction
in the other direction can possibly increase the representation of a constraint by an exponential factor. Nevertheless,
the hardness results of this section apply to the "easier" problem CSP$_{\mathrm{tt}}$($\mathcal{H}$) as well. What we have to verify is that the
proof of Lemma 7.5 works even if $I_2$ is an instance of CSP$_{\mathrm{tt}}$, i.e., the constraint relations have to be represented by truth
tables. Inspection of the proof shows that it indeed works: the product in Inequality (2) is exactly the size of the truth
table describing the constraint corresponding to edge $e$, thus the $|D_1|^q$ upper bound remains valid even if constraints
are represented by truth tables. Therefore, the hardness results of [44] are subsumed by the following corollary:

Corollary 7.8. If $\mathcal{H}$ is a recursively enumerable class of hypergraphs with unbounded submodular width, then CSP$_{\mathrm{tt}}$($\mathcal{H}$)
is not fixed-parameter tractable, unless the Exponential Time Hypothesis fails.

8 Conclusions 

The main result of the paper is introducing submodular width and proving that bounded submodular width is the
property that determines the fixed-parameter tractability of CSP($\mathcal{H}$). The hardness result is proved assuming the
Exponential Time Hypothesis. This conjecture was formulated relatively recently [35], but it turned out to be very
useful in proving lower bounds in a variety of settings [42, 6, 43, 49].

For the hardness proof, we had to understand what large submodular width means and we had to explore the
connection between submodular width and other combinatorial properties. We have obtained several equivalent
characterizations of bounded submodular width; in particular, we have shown that bounded submodular width is equivalent
to bounded adaptive width:

Corollary 8.1. The following are equivalent for every class $\mathcal{H}$ of hypergraphs:

1. There is a constant $c_1$ such that $\mu$-width$(H) \le c_1$ for every $H \in \mathcal{H}$ and fractional independent set $\mu$.

2. There is a constant $c_2$ such that $b$-width$(H) \le c_2$ for every $H \in \mathcal{H}$ and edge-dominated monotone submodular
function $b$ on $V(H)$.

3. There is a constant $c_3$ such that $b^*$-width$(H) \le c_3$ for every $H \in \mathcal{H}$ and edge-dominated monotone submodular
function $b$ on $V(H)$.

4. There is a constant $c_4$ such that $\mathrm{con}_\lambda(H) \le c_4$ for every $H \in \mathcal{H}$, where $\lambda > 0$ is a universal constant.

5. There is a constant $c_5$ such that $\mathrm{emb}(H) \le c_5$ for every $H \in \mathcal{H}$.

Implications (2)$\Rightarrow$(1) and (3)$\Rightarrow$(2) are trivial; (4)$\Rightarrow$(3) follows from Theorem 5.1; (5)$\Rightarrow$(4) follows from
Corollary 6.2; (1)$\Rightarrow$(5) follows from Corollary 6.10.

Let us briefly review the main ideas that were necessary for proving the main result of the paper: 

• Recognizing that submodular width is the right property characterizing the complexity of the problem.

• A CSP instance can be partitioned into a bounded number of uniform instances (Section 4.1).

• The number of solutions in a uniform CSP instance can be described by a submodular function (Section 4.2).

• There is a connection between fractional separation and finding a separator minimizing an edge-dominated
submodular cost function (Section 5.2).

• The transformation that turns $b$ into $b^*$, the properties of $b^*$ (Section 5.1).

• Our results on fractional separation and the standard framework of finding tree decompositions show that large
submodular width implies that there is a highly connected set (Section 5.3).

• A highly connected set can be turned into a highly connected set that is partitioned into cliques in an appropriate
way (Section 6.1).

• A highly connected set with appropriate cliques implies that there is a uniform concurrent flow of large value
between the cliques (Section 6.2).

• Similarly to [42], we use the observation that a concurrent flow is analogous to a line graph of a clique, hence it
has good embedding properties (Section 6.2).

• Similarly to [42], an embedding in a hypergraph gives a way of simulating 3SAT with CSP($\mathcal{H}$) (Section 7).

It is possible that the main result can be proved in a simpler way by bypassing some of the ideas above. In particular, a
surprising consequence of our results is that bounded submodular width and bounded adaptive width are the same, i.e.,
if a class $\mathcal{H}$ has unbounded submodular width, then for every $k$ there is an $H_k \in \mathcal{H}$ and a fractional independent set $\mu_k$
such that $\mu_k$-width$(H_k) > k$. To prove this, we need all the results of Sections 5 and 6. Having a better understanding
and an independent proof of this fact could simplify the proofs considerably. Another possible target for simplification
is Section 6.1, where a lot of effort is spent on proving that if there is a large highly connected set, then there is a
large highly connected set that is partitioned into cliques in an appropriate way. It might be possible to strengthen the
results of Section 5 (perhaps by better understanding the role of cliques in separators) so that they give such a highly
connected set directly.

An obvious question for further research is whether it is possible to prove a similar dichotomy result with respect
to polynomial-time solvability. At this point, it is hard to see what the answer could be for this more restricted
notion of tractability. We know that bounded fractional hypertree
width implies polynomial-time solvability [41] and Theorem 7.1 shows that unbounded submodular width implies that
the problem is not polynomial-time solvable (as it is not even fixed-parameter tractable). So only those classes of
hypergraphs are in the "grey zone" that have bounded submodular width but unbounded fractional hypertree width.

What could be the truth in this grey zone? A first possibility is that CSP($\mathcal{H}$) is polynomial-time solvable for
every such class, i.e., Theorem 4.1 can be improved from fixed-parameter tractability to polynomial-time solvability.
However, Theorem 4.1 uses the power of fixed-parameter tractability in an essential way (splitting into a
double-exponential number of uniform instances), so it is not clear how such an improvement is possible. A second possibility
is that unbounded fractional hypertree width implies that CSP($\mathcal{H}$) is not polynomial-time solvable. Substantially new
techniques would be required for such a hardness proof. The hardness proofs of this paper and of [27, 42] are based
on showing that a large problem space can be efficiently embedded into an instance with a particular hypergraph.
However, the fixed-parameter tractability results show that no such embedding is possible in case of classes with
bounded submodular width. Therefore, a possible hardness proof should embed a problem space that is comparable
(in some sense) with the size of the hypergraph and should create instances where the domain size is bounded by
a function of the size of the hypergraph. A third possibility is that the boundary of polynomial-time solvability is
somewhere between bounded fractional hypertree width and bounded submodular width. Currently, there is no natural
candidate for a property that could correspond to this boundary and, again, the hardness part of the characterization
should be substantially different from what was done before. Finally, there is a fourth possibility: the boundary of
the polynomial-time cases cannot be elegantly characterized by a simple combinatorial property. In general, if we
consider the restriction of a problem to all possible classes of (hyper)graphs, then there is no a priori reason why an
elegant characterization should exist that describes the easy and hard classes. For example, it is highly unlikely
that there is an elegant characterization of those classes of graphs where the Maximum Independent Set
problem is polynomial-time solvable. As discussed earlier, the fixed-parameter tractability of CSP($\mathcal{H}$) is a more robust
question than its polynomial-time solvability, hence it is quite possible that only the former question has an elegant
answer.